Deloitte’s WordsWorth Text-Mining Solution
Deloitte’s WordsWorth is a state-of-the-art, cloud-capable text-mining offering that spans scanning, named-entity recognition, document mark-up, semantic search, text translation and summarization, and table and invoice extraction.
Today’s digital world has triggered a Cambrian explosion of documents, which are ever easier to produce and to transmit. While offices worldwide have indeed become increasingly “paperless”, paper reports have only partially been superseded by electronic exchanges of structured, tabular data. Instead, unstructured narratives have migrated from paper to PDF or equivalent formats. The volume of textual documents has soared, lifted by improved editing tools, automated text generation, and a dramatic increase in the channels available to an author for disseminating a message: emails, chats, blogs, social media, cloud drives and collaborative document-sharing suites, to name a few. Furthermore, multiple providers of each of these channels compete to best serve the human need to tell a story, to instruct or explain, or simply to express views.
The proliferation of documents, types and formats poses a significant challenge to the reader. It is increasingly difficult to distinguish useful signals from spurious noise, or even facts from opinion. The digitization of processes and businesses, together with “always-on” reachability through mobile devices, has raised expectations for quick results. There is simply not enough time to wade through the flood of documents and discern which are important or reliable. Fortunately, machines armed with Natural Language Processing (NLP) algorithms can help. NLP promises efficiency, quality and exhaustive coverage in working with unstructured, textual documents.
Built on pre-trained, universally applicable language models and enhanced with case-specific vocabulary, the text-mining solution WordsWorth excels at the accurate interpretation of text documents. It achieves this by combining the most advanced underlying methods from multiple cloud providers (AWS, Azure, GCP) with the flexibility of multiple, dedicated open-source algorithms. It offers users two ways to interact with its functionality: through the intuitive graphical user interface (GUI), or through dedicated Python libraries, which may be invoked via the command line or embedded within custom applications.
WordsWorth performs a wide spectrum of text-mining services: