
CASE STUDY

Revolutionizing Document Access for a Technology Services Firm

We delivered secure, efficient document access for a technology services firm using context-aware search and summarization.

26 AUG, 2024


Challenge

Our client, a leading technology services firm, needed a secure and efficient platform for employees to find and summarize company documents using context-aware search. The solution had to ensure that documents remained internal and were not exposed to any external platform.

Solution

We built the solution on LangChain as our orchestration framework. This choice gave us the flexibility to switch between embedders, databases, and LLMs, whether open-source or paid. It also allowed us to run multiple chains against the same vector stores simultaneously, which improved the accuracy of our solution.

We deployed a self-hosted LLM using Amazon SageMaker JumpStart, with Falcon 7B as our base model. Because the source files were scattered across SharePoint and S3 buckets, we used the ready-to-use document loaders in LangChain. The files were chunked, vectorized with the Sentence Transformers library, and stored in a vector database. For our use case, we selected a PostgreSQL server on AWS with the pgvector extension.
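
As an illustration of the chunking step, here is a minimal sketch in plain Python. It assumes fixed-size character windows with overlap. The function name `chunk_text` and the default sizes are hypothetical, since the case study does not state which splitter or chunk size was used.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Assumption: character-based windows; the real project may have
    split on tokens, sentences, or document structure instead.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so no trailing chunk is a pure repeat of the overlap.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

Each chunk would then be passed to the embedding model and stored alongside its source document identifier. The overlap keeps sentences that straddle a boundary retrievable from at least one chunk.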

We programmatically mapped user permissions to the vectorized document metadata, ensuring restricted access based on user/group permissions. Additionally, we integrated document webhooks from SharePoint to track updates, keeping our database current with the latest content and access privileges.  
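
The permission mapping can be pictured as an access list carried in each chunk's metadata and rewritten whenever a webhook reports a change. The sketch below is an assumption about the shape of that data, not the project's actual schema. `ChunkRecord`, `apply_permission_update`, and `can_read` are hypothetical names, and the in-memory list stands in for the vector database.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkRecord:
    """One vectorized chunk plus the access metadata attached to it (hypothetical)."""
    doc_id: str
    text: str
    allowed_groups: set[str] = field(default_factory=set)


def apply_permission_update(records: list[ChunkRecord], doc_id: str,
                            allowed_groups: set[str]) -> int:
    """Overwrite the access list on every chunk of one document.

    Intended to be called from a webhook handler when the source system
    reports a permission change. Returns the number of chunks updated.
    """
    updated = 0
    for record in records:
        if record.doc_id == doc_id:
            record.allowed_groups = set(allowed_groups)
            updated += 1
    return updated


def can_read(record: ChunkRecord, user_groups: set[str]) -> bool:
    """A user may read a chunk if they share at least one group with it.

    A chunk with no groups is readable by nobody (default deny).
    """
    return not record.allowed_groups.isdisjoint(user_groups)
```

Storing the access list on every chunk, rather than only on the parent document, lets the access check run as a metadata filter at query time.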

When a user submitted a query, the system retrieved relevant documents with their matching scores and applied filters that compared each document's access information against that user's permissions. The top ten records, sorted by matching score, were then passed to our self-hosted LLM in a RAG prompt and summarized.
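
The query flow described above (score, filter by access, keep the top ten) can be sketched as follows. This is an in-memory illustration using cosine similarity. In the deployed system the scoring and filtering would run inside the vector database, and the row layout and the name `retrieve` are assumptions made for the example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec: list[float], rows: list[dict],
             user_groups: set[str], k: int = 10) -> list[tuple[str, float]]:
    """Return up to k (doc_id, score) pairs the user is allowed to see.

    Each row is a dict with hypothetical keys:
    "doc_id", "embedding", and "allowed_groups" (a set of group names).
    """
    # Drop rows the user cannot access before ranking, so restricted
    # content never reaches the summarization prompt.
    visible = [r for r in rows if r["allowed_groups"] & user_groups]
    scored = [(r["doc_id"], cosine(query_vec, r["embedding"])) for r in visible]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The returned records would then be assembled into the prompt for the summarizing LLM. Filtering before the top-k cut means a user with narrow permissions still receives up to ten results.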

Architecture Diagram of LangChain Implementation

We also enabled chat history tracking, organized by topics that the user defines. This approach guaranteed a streamlined and secure method for accessing summarized company information, significantly enhancing operational efficiency and data security.
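
One simple way to picture topic-based history is a mapping from a user-chosen topic to an ordered list of question and answer pairs. The sketch below is hypothetical. The class name `TopicHistory` and its methods are illustrative, and the case study does not describe how history was actually stored.

```python
class TopicHistory:
    """Keeps chat turns grouped under user-defined topic names (illustrative only)."""

    def __init__(self) -> None:
        self._turns: dict[str, list[tuple[str, str]]] = {}

    def add_turn(self, topic: str, question: str, answer: str) -> None:
        """Append one question/answer pair to the given topic."""
        self._turns.setdefault(topic, []).append((question, answer))

    def recent(self, topic: str, n: int = 5) -> list[tuple[str, str]]:
        """Return the last n turns for a topic, oldest first."""
        if n <= 0:
            return []
        return self._turns.get(topic, [])[-n:]

    def topics(self) -> list[str]:
        """List all topics that have at least one turn, alphabetically."""
        return sorted(self._turns)
```

Recent turns for the active topic could be included in the prompt so that follow-up questions keep their context.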

Results

  • User Topic Search: Employees quickly obtained detailed summaries from large documents, with no external information included. 

  • Access-Enabled Summarization: Employees received brief overviews of matching documents, limited by their access permissions. 

  • Reduced Processing Time: Our solution replaced the previous regex-based text matching process, which often missed relevant documents, with a context-aware approach that provided comprehensive summaries.

This project showcases our skill, talent, and expertise in developing innovative, secure, and efficient document management solutions tailored to our clients' unique needs. 
