Adopting AI shouldn’t mean risking data security. For organisations handling sensitive knowledge, every query is a matter of trust. What if your teams could ask any internal question — and get an accurate, policy-compliant answer in seconds?

As McKinsey reports, nearly 70% of enterprises already use AI across their operations. Yet only a fraction of organisations deploy generative models securely within their internal environments. At Codelab, we designed a solution that makes this possible — a secure LLM-based assistant that truly understands, protects, and delivers enterprise knowledge.

Enterprise context and security requirements

We faced these challenges ourselves in internal knowledge management: extracting information quickly from an ocean of documents, then analysing it and drawing conclusions from it. With a vast repository of sensitive company data and a growing need for efficient, secure, and accurate responses, we sought a cutting-edge solution to enhance our operations.

Challenge: Secure and scalable knowledge retrieval

Our existing internal knowledge-sharing systems were struggling to meet the following demands:

  • Efficient Access to Sensitive Data: Employees in different roles, especially those working directly on projects and internal processes, required quick and secure access to critical information stored in the company’s internal repositories.

  • Scalability and Accuracy: The system needed to handle a growing volume of queries while maintaining high accuracy and relevance in responses.

  • Fallback for Unavailable Data: When internal data was insufficient, the system needed to provide reliable external insights without compromising security.

The real breakthrough wasn’t getting the model to generate answers — it was turning messy internal knowledge into something structured, searchable, and secure.

Patryk Skwarcan

Project Manager

Solution Architecture: LLM + RAG + AWS Deployment

Codelab developed and deployed a Virtual Assistant powered by an OpenAI model and a Retrieval-Augmented Generation (RAG) architecture, hosted securely on AWS. The solution was designed with the following key features:

  • Secure Data Access:
    • The LLaMA-based assistant was integrated with the company’s sensitive data repositories via a secure VPN connection.
    • Additional access control mechanisms ensured that only authorized users could retrieve specific data, adhering to the company’s strict security policies.
  • Retrieval-Augmented Generation (RAG):
    • The assistant utilized RAG to retrieve relevant information from the company’s internal documents before generating responses. This ensured that answers were accurate, contextually relevant, and based on the company’s proprietary knowledge.
  • AWS Hosting:
    • The entire solution was hosted on AWS, leveraging its scalability, reliability, and security features.
    • The solution was also designed to switch between CPU and GPU compute on AWS depending on the customer’s demands: fast-paced responses (in the style of a live chat conversation) or deeper reasoning with slower response times.
    • The application is also ready to work offline using a LLaMA model; we also support OpenAI-based deployments.
  • User-Friendly Interface:
    • The assistant was accessible through a simple and intuitive chat interface, enabling employees and customers to interact effortlessly.
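The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not the production code: the helper names (`retrieve`, `build_prompt`), the toy similarity measure, and the sample documents are all hypothetical.

```python
# Minimal RAG sketch: retrieve the most relevant internal documents first,
# then ground the model's prompt in that retrieved context.
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]


def build_prompt(query: str, context_docs: list[str]) -> str:
    """Constrain the LLM to answer from the retrieved internal context."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this internal context:\n{context}\n\nQuestion: {query}"


docs = [
    "VPN access requires an approved security token.",
    "Holiday requests are filed through the HR portal.",
    "Production deployments need two approvals.",
]
query = "How do I get VPN access?"
prompt = build_prompt(query, retrieve(query, docs))
```

In the real system, `prompt` would be sent to the hosted OpenAI or LLaMA model; here the retrieval step alone already surfaces the relevant internal document.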

Measurable business results

▪️Annual Time Savings: 2,500 Hours: Automation of repetitive queries reduced manual workload by approx. 2,500 hours per year (equivalent to 1.3 FTE).

▪️Annual Cost Savings: €75,000: The reduced manual workload translated into approximately €75,000 in annual cost savings.

▪️Automation Rate: 40–70%: The AI assistant autonomously handled 40–70% of recurring queries.

▪️Human Escalation Rate: <10%: Fewer than 10% of total queries required human intervention.

▪️Knowledge Retrieval Precision: ≥80%: Precision rate achieved in LLM-based internal document search.
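For readers unfamiliar with the precision metric, here is how a figure like the ≥80% above can be measured; the document IDs and relevance judgments below are made-up sample data, not the production query logs.

```python
# Retrieval precision: the fraction of retrieved documents that reviewers
# judged relevant to the query.
def precision(retrieved: list[str], relevant: set[str]) -> float:
    if not retrieved:
        return 0.0
    return sum(1 for d in retrieved if d in relevant) / len(retrieved)


# Hypothetical evaluation sample: 5 documents retrieved, 4 judged relevant.
retrieved = ["doc_a", "doc_b", "doc_c", "doc_d", "doc_e"]
relevant = {"doc_a", "doc_b", "doc_d", "doc_e"}
score = precision(retrieved, relevant)  # 4 of 5 retrieved are relevant
```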


Why this architecture works in high-security environments

Codelab implemented a secure and scalable Virtual Assistant based on LLaMA, Retrieval-Augmented Generation (RAG), and AWS infrastructure, designed to operate within a controlled enterprise environment. The system addresses the identified knowledge retrieval challenges and provides a foundation for further internal AI capability development.

A significant portion of the engineering effort focused on data mining and document preparation. The RAG layer accounted for less than 30% of the total implementation work, with the majority dedicated to data structuring, normalization, and indexing optimization. The system also exposes a secured API layer via FastAPI for controlled internal integration.
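As an illustration of the access-control idea behind that secured API layer, here is a minimal sketch of a token-to-repository check. The token values and role mapping are hypothetical; in production such a check would run as a FastAPI dependency in front of each retrieval endpoint.

```python
# Sketch: only tokens whose role grants access to a repository may query it.
import hmac

# Hypothetical mapping of API tokens to the repositories that role may query.
TOKEN_ROLES = {
    "pm-token-123": {"projects", "processes"},
    "hr-token-456": {"hr"},
}


def authorize(token: str, repository: str) -> bool:
    """Allow retrieval only when the token grants access to the repository."""
    for known_token, repos in TOKEN_ROLES.items():
        # compare_digest avoids leaking token contents via timing differences
        if hmac.compare_digest(token, known_token):
            return repository in repos
    return False
```

A FastAPI dependency wrapping `authorize` would reject the request with a 403 before any retrieval runs, which is how the strict per-role access policy mentioned earlier stays enforced at the API boundary.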

Technical stack

  • LLaMA Model & OpenAI: For natural language understanding and generation.
  • Retrieval-Augmented Generation (RAG): For combining internal document retrieval with generative AI capabilities.
  • AWS: For secure and scalable hosting.
  • Beyond a typical RAG implementation, we also built non-trivial features: question rephrasing, chat-history-aware retrieval, and hybrid search.
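Hybrid search, the last item above, blends a keyword score with a semantic score so that both exact terms and paraphrases rank well. The sketch below uses deliberately simple stand-ins (token overlap for BM25, character-trigram overlap for embedding similarity); the real system would use a proper lexical index and embedding model.

```python
# Hybrid search sketch: weighted blend of lexical and semantic relevance.
def keyword_score(query: str, doc: str) -> float:
    """Stand-in for BM25: fraction of query tokens present in the document."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0


def semantic_score(query: str, doc: str) -> float:
    """Stand-in for embedding similarity: character-trigram Jaccard overlap."""
    grams = lambda s: {s[i:i + 3] for i in range(len(s) - 2)}
    q, d = grams(query.lower()), grams(doc.lower())
    return len(q & d) / len(q | d) if q | d else 0.0


def hybrid_search(query: str, docs: list[str], alpha: float = 0.5) -> list[str]:
    """Rank documents by a weighted combination of both scores."""
    score = lambda doc: (alpha * keyword_score(query, doc)
                         + (1 - alpha) * semantic_score(query, doc))
    return sorted(docs, key=score, reverse=True)


docs = ["VPN setup guide", "Holiday policy", "Deployment checklist"]
ranked = hybrid_search("how to set up vpn", docs)
```

Tuning `alpha` shifts the balance: higher values favour exact keyword matches, lower values favour semantic similarity, which is useful when users phrase questions differently from the source documents.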

Scaling secure enterprise AI: From internal assistant to production-ready AI systems

The deployment of this internal system demonstrates how AI-driven solutions can operate within high-security enterprise environments. Codelab continues to develop secure AI architectures and cloud-based systems to expand internal capabilities and support production-ready enterprise deployments.