Text Analysis, Text Mining, and Information Retrieval Software

Commercial | online | free

On-line Text Mining / Text Analytics Tools

  • Wordle, a tool for generating "word clouds" from text that you provide.

Commercial Text Mining / Text Analytics Software

  • ActivePoint, offering natural language processing and smart online catalogues, based on contextual search and ActivePoint's TX5(TM) Discovery Engine.
  • Aiaioo Labs, offering APIs for intention analysis, sentiment analysis and event analysis. Aiaioo online demo.
  • Alceste, software for the automatic analysis of textual data (open questions, literature, articles, etc.).
  • AlchemyAPI, a text analysis service that reports processing billions of documents every month.
  • Anderson Analytics OdinText, complete text analytics software platform for consumer insights and customer service professionals.
  • Angoss Text Analytics, part of KnowledgeStudio, allows users to merge the output of unstructured, text-based analytics with structured data to perform data mining and predictive analytics.
  • Ascribe, offering a unique hybrid technology approach, blending natural language processing, machine learning and semi-automated coding tools, since 1999.
  • Attensity, offers a complete suite of text analytics applications, including the ability to extract "who", "what", "where", "when" and "why" facts and then drill down to understand people, places and events and how they are related.
  • Basis Technology, provides natural language processing technology for the analysis of unstructured multilingual text.
  • Clarabridge, text mining software providing an end-to-end solution for customer experience professionals wishing to transform customer feedback into marketing, service and product improvements.
  • ClearForest, tools for analysis and visualization of your document collection.
  • Clustify, groups related documents into clusters, providing an overview of the document set and aiding with categorization.
  • Compare Suite, compares texts by keywords, highlights common and unique keywords.
  • Connexor Machine, discovers the grammatical and semantic information of natural language.
  • Copernic Summarizer, can read and summarize document and Web page text in many languages from various applications.
  • Crossminder, natural language processing and text analytics (including cross-lingual text mining).
  • DataRPM, offering Natural Language Question Answering and Automatic Data Modeling.
  • Dhiti, providing an API for text-mining; can work on a document collection and mine out topics and concepts in realtime.
  • DiscoverText, a cloud-based text analytics solution with many powerful features, including an ActiveLearning machine classification engine. Provides valuable insights about employees, customers, products, news, and citizens.
  • dtSearch, for indexing, searching, and retrieving free-form text files.
  • Eaagle text mining software, enables you to rapidly analyze large volumes of unstructured text, create reports and easily communicate your findings.
  • Enkata, providing a range of enterprise-level solutions for text analysis.
  • Entrieva, patented technology indexes, categorizes and organizes unstructured text from virtually any source.
  • Expert System, using its proprietary COGITO platform for semantic comprehension of language to support knowledge management of unstructured information.
  • Files Search Assistant, quick and efficient search within text documents.
  • IBM InfoSphere Warehouse Enterprise Edition, including advanced analytics, OLAP, data mining and text analytics.
  • IBM SPSS Predictive Analytics suite for data and text mining.
  • IKANOW Infinit.e, all-in-one big data analytics solution for harvesting and analyzing both structured and unstructured data, including social media data from Twitter, Facebook, and Google+.
  • Intellexer, natural language searching technologies for developing knowledge management tools, document comparison software and document summarization software, custom built search engines and other intelligent software.
  • ISYS Search Software, an enterprise search software supplier specializing in embedded search, text extraction, federated access solutions and text analytics.
  • IxReveal, offering uReveal "plug-in" advanced analytic platform and uReka! desktop "search and analyze" consumer product, based on patented text analytics methods.
  • KBSPortal, offers natural language processing as SaaS web service.
  • KNIME, an open source analytics platform which offers extensions for text analysis, currently including Stanford NLP, Palladian, and Linguamatics.
  • Kwalitan 5 for Windows, uses codes for text fragments to facilitate textual search, display overviews, build hierarchical trees and more.
  • KXEN Text Coder (KTC), text analytics solution for automatically preparing and transforming unstructured text attributes into a structured representation for use in KXEN Analytic Framework.
  • Langsoft question-answering and content recognition/text attribution software, evaluation copy available.
  • Lexalytics, provides enterprise and hosted text analytics software to transform unstructured text into structured data.
  • Leximancer, makes automatic concept maps of text data collections.
  • Lextek Onix Toolkit, for adding high performance full-text indexing search and retrieval to applications.
  • Lextek Profiling Engine, for automatically classifying, routing, and filtering electronic text according to user defined profiles.
  • Linguamatics, offering Natural language processing (NLP), search engine approach, intuitive reporting, and domain knowledge plug-in.
  • Luminoso, ontology-free text analytics solution, led by some of the top research scientists at the MIT Media Lab.
  • Megaputer Text Analyst, offers semantic analysis of free-form texts, summarization, clustering, navigation, and natural language retrieval with search dynamic refocusing.
  • Monarch, data access and analysis tool that lets you transform any report into a live database.
  • NetOwl (from SRA International), multilingual text and entity analytics: extracts entities, links, and events, performs name matching and identity resolution, assigns latitude/longitude to geographical references, translates names in foreign languages, and performs sentiment analysis.
  • NewsFeed Researcher, presents a live multi-document summarization tool, with automatically generated RSS news feeds.
  • Nstein, enterprise search and information access technologies, intended to guide visitors to your public website to the most relevant information quickly.
  • Ontotext, provides semantic technology blending text mining, inference and a graph database to deliver optimized knowledge management, search and semantic analysis solutions.
  • Picturesafe, a semantic system that automatically categorizes and analyzes information, recognizes content and similarities between different media, and speeds up journalistic and publishing research.
  • Plagiarism Software, free online check for plagiarism.
  • PolyVista, advanced listening, filtering, and analysis software and services to make sense of everything said about your company.
  • Power Text Solutions, extensive capabilities for "free text" analysis, offering commercial products and custom applications.
  • Readability Studio, offers tools for determining text readability levels.
  • Recommind MindServer, uses PLSA (Probabilistic Latent Semantic Analysis) for accurate retrieval and categorization of texts.
  • SAS Text Miner, provides a rich suite of text processing and analysis tools.
  • Semantex from Janya Inc., enterprise-class information extraction system, detecting entities, attributes, relationships and events.
  • Skyttle API, a SaaS platform for sentiment analysis and keyword extraction. Supports English, French, German and Russian. An online demo is available.
  • SWAPit, Fraunhofer-FIT's text and data analysis tool (updated version of DocMINER), offers visual text mining and retrieval capabilities, including search, term statistics, and summary; visualises semantic relationships among text documents.
  • TEMIS Luxid®, an Information Discovery solution serving the Information Intelligence needs of business corporations.
  • TeSSI®, software components that perform semantic indexing, semantic searching, coding and information extraction on biomedical literature.
  • Text Analysis Info, offering software and links for text analysis and more.
  • Textalyser, online text analysis tool, providing detailed text statistics.
  • Textalytics, a suite of text analytics APIs for semantic tagging of multilingual/multimedia content (opinion/sentiment/reputation analysis, classification, information extraction, etc.), with free usage up to 1M words/month.
  • TextPipe Pro, text conversion, extraction and manipulation workbench.
  • TextQuest, text analysis software.
  • Treparel KMX Text Analytics, delivers fast and powerful search, clear visual insights and advanced analytics for information professionals, information consumers and OEM partnerships.
  • Readware Information Processor for Intranets and the Internet, classifies documents by content; provides literal and conceptual search; includes a ConceptBase with English, French or German lexicons.
  • Quenza, automatically extracts entities and cross references from free text documents and builds a database for subsequent analysis.
  • VantagePoint provides a variety of interactive graphical views and analysis tools with powerful capabilities to discover knowledge from text databases.
  • VisualText™ by TextAI, a comprehensive GUI development environment for quickly building accurate text analyzers.
  • VP Student Edition, a powerful text-mining and visualization tool for discovering knowledge in search results from science literature and other field-structured text databases.
  • Xanalys Indexer, an information extraction and data mining library aimed at extracting entities, and particularly the relationships between them, from plain text.
  • Wordstat, analysis module for textual information such as responses to open-ended questions, interviews, etc.

Many packages above offer free or limited trial versions.

Free and Open-Source Text Mining / Text Analytics Software

  • Data Science Toolkit, includes geo, text, NLP, and sentiment analysis tools.
  • Datumbox, a free API with functions for sentiment analysis, language detection, and topic classification, for building intelligent apps.
  • FreeLing, an open source language analysis tool suite, GNU GPL.
  • GATE, a leading open-source toolkit for Text Mining, with a free open source framework (or SDK) and graphical development environment.
  • A free online grammar check for English.
  • IKANOW Infinit.e open source Community Edition, a scalable framework for collecting, storing, processing, retrieving, analyzing, and visualizing unstructured documents and structured records.
  • INTEXT, MS-DOS version of TextQuest, in public domain since Jan 2, 2003.
  • LingPipe is a suite of Java libraries for the linguistic analysis of human language.
  • Open Calais, an open-source toolkit for including semantic functionality within your blog, content management system, website or application.
  • RapidMiner Text Mining.
  • ReVerb: Open Information Extraction Software, extracts binary relationships like high-in(winter squash, vitamin c) without requiring any relation-specific training data.
  • S-EM (Spy-EM), a text classification system that learns from positive and unlabeled examples.
  • The Semantic Indexing Project, offering open source tools, including Semantic Engine - a standalone indexer/search application.