Just as data mining techniques provide researchers with an automated method of gleaning unsuspected knowledge from structured electronic data (databases), parallel techniques are being developed to detect patterns and connections in unstructured electronic data (email, documents, websites). Text mining is in its infancy, but it offers hope for conquering the growing knowledge glut.
Uses text clustering to group documents according to the overall similarities in their content. Features background information, examples and technical papers. http://www.sims.berkeley.edu/~hearst/sg-overview.html
An interface which attempts to show the user, graphically, the relationship between the words in the query and the documents retrieved. Features technical papers and commercial and other uses. http://www.sims.berkeley.edu/~hearst/tb-overview.html
Netherlands firm offers search and text mining solutions on a hosting basis. Features services, support, portfolio and contact information. http://www.eidetica.com/
Defines data mining, information access, and corpus-based computational linguistics, and then discusses the relationship of these to text data mining. The intent behind these contrasts is to draw attention to exciting new kinds of problems for computation http://www.sims.berkeley.edu/~hearst/papers/acl99/acl99-tdm.html
The Text Mining group at the University of Waikato in New Zealand. With a focus on Viterbi search and entropy-based methods, the group has a compression feel to it. http://www.cs.waikato.ac.nz/~nzdl/textmining/
TextAnalyst is a unique text mining tool, using a semantic network for retrieval, clustering, classification, summarization, and natural language querying. http://www.megaputer.com/products/ta/index.php3
Provides NLP applications based on its proprietary VisualText technology. Product and service information, online software tour and documentation. http://www.textanalysis.com/
Links to reviews and analyses of text mining research. Features online presentations, white papers and other projects, papers, people and products. http://filebox.vt.edu/users/wfan/text_mining.html
Profiles content from a web page or a content database, and uses data mining techniques to associate profiled content dynamically during a browsing session. http://www.megaputer.com/products/wa/index.php3
Products and services offering automated collection of data from public web sites. Features overviews and contact information. http://www.automated-info-solutions.com/
Bayesian-based technology for mapping and mining concepts in large text collections. Features overview, services, technique, gallery and contact information. http://www.leximancer.com/index.html
Develops applications to search out and retrieve data from web pages and e-mail archives. Also provides turnkey services to find and retrieve information from the web, XML databases or stores of incoming e-mail. Features contact information. http://www.logical.btinternet.co.uk/webinf.htm
Automated system for retrieving and structuring information from open Internet sources and corporate warehouses. Features demo, news, FAQs and contact information. http://www.webobserver.co.uk
Offers solutions for understanding textual content and the automatic comprehension of text meaning by a computer. Features products, news and contact information. http://www.compris.com/
Internet-based set of tools for text analysis, including categorisation and summarisation of documents. Online demo. http://www.delft-cluster.nl/textminer/
Part of the SAS suite of tools for quantitative data analysis, the text analysis module offers clustering algorithms, document categorisation and data extraction. Overview and screenshots. http://www.sas.com/technologies/analytics/datamining/textminer/