Working paper. Content reflects ongoing research and may change.


Proprietary Data Exposure in AI Systems

January 08, 2026

Working draft outlining how proprietary data can leak through AI systems, with mitigation ideas for enterprise use.

Status: Draft. References are provided for context and may be updated.

Threat Model (Actors, Assets, Boundaries)

Actors: The primary actor is the well-intentioned insider (employee or contractor) who uses AI tools in daily work. They are typically non-malicious but can inadvertently become a threat by pasting sensitive content into AI services. External adversaries also exist: e.g. attackers who attempt to extract confidential data from an AI model or compromise the AI provider’s systems.

Assets: The assets at risk are proprietary information (defined below) – essentially any non-public data that gives the enterprise value or must be protected by law. These include trade secrets (like source code or formulas), internal business communications, customer records, personally identifiable information (PII), protected health information (PHI), and export-controlled technical data.

Boundaries: A critical boundary lies between the enterprise’s secure environment and external AI infrastructure. When an employee uses a cloud-based AI (like a public chatbot), proprietary data crosses from the controlled corporate network to an external system outside the company’s direct oversight. Within the AI provider’s domain, data may be stored, logged, or even become part of the model’s knowledge. A secondary boundary exists between accounts: if an employee uses a personal AI account (instead of an enterprise-managed account), corporate data moves to a space completely outside corporate identity management[1]. This undermines governance, as the data is no longer protected by the enterprise’s access controls.

Illustrative Data Flow

The diagram below outlines a typical data flow when an employee uses an AI tool. It shows how sensitive data can traverse from the user's environment to the AI model and potentially leak to others:

sequenceDiagram
    participant E as Employee
    participant C as Chat Interface (Web/IDE)
    participant S as AI Cloud Service
    participant M as ML Model
    participant O as Other User
    E->>C: Enter prompt with proprietary info
    C->>S: Transmit data via API or extension
    Note right of S: Data may be logged or stored
    S->>M: Process prompt with ML model
    M-->>S: Generated response
    S-->>C: Return answer
    C-->>E: Display answer
    Note over E,S: Proprietary data now outside corporate boundary
    O->>S: External user queries model
    S->>M: Process external query
    M-->>O: Output could include memorized proprietary info
    Note over M,O: Model training may expose prior data in responses

In summary, the threat model acknowledges that employees under pressure (or lacking clear guidance) can act as inadvertent leakers of sensitive assets, and once data crosses the enterprise boundary into an AI system, it may be retained or reproduced outside authorized channels. Attackers might then exploit this by querying models for memorized secrets or targeting the AI provider's data stores.

Defining "Proprietary Information"

Modern enterprises maintain a wide spectrum of proprietary information that must remain confidential. This spans multiple categories, each often governed by specific legal definitions or industry standards:

Trade Secrets & Intellectual Property

Any information that derives economic value from not being publicly known and is subject to secrecy efforts[2]. Classic examples are source code, algorithms, product designs, manufacturing processes, proprietary formulas, and business methods. Under the Uniform Trade Secrets Act (UTSA), trade secrets include "information, including a formula, pattern, compilation, program, device, method, technique, or process" that has economic value from secrecy and is protected by reasonable measures[2]. For instance, a software company's unreleased source code or a tech firm's hardware specifications are trade secrets. These often overlap with intellectual property (IP) assets like patentable inventions or copyrighted works (though patented info becomes public, so trade secret status usually applies to un-patented proprietary know-how).

Confidential Business Information

Non-public business materials that give a competitive edge. This includes internal communications (emails, chat logs, meeting minutes), strategic plans and roadmaps, marketing research, pricing strategies, merger or acquisition plans, supplier contracts, and so on. Even if not formally "trade secrets," such information is typically classified as confidential or "internal use only" within data classification schemes[3][4]. For example, a company's 5-year product strategy or an internal memo about a client project would fall here. These might not have legal definitions but are often covered by Non-Disclosure Agreements (NDAs) and corporate policy.

Customer and Partner Data

Data entrusted to the company by customers or business partners under confidentiality. This includes client databases, order histories, support tickets, and any integration data. Leaking customer data not only breaks trust but may invoke privacy laws if it contains personal information.

Personally Identifiable Information (PII)

Any data that can identify a specific individual. PII is defined broadly in privacy standards and laws. It includes obvious identifiers like names, addresses, emails, phone numbers, Social Security numbers, as well as combinations of data that can pinpoint a person[5]. For example, a user profile with name and contact info, or a list of customers with their IDs, is PII. Many jurisdictions (GDPR, CCPA, etc.) regulate PII, requiring explicit consent or safeguards for processing and transfer. In corporate terms, PII about employees or customers is usually classified as sensitive.

Protected Health Information (PHI)

In healthcare contexts, PHI refers to identifiable health-related information (medical records, diagnoses, treatment details, insurance information, etc.) that is protected under laws like HIPAA[6][7]. PHI includes any health information tied to an individual (e.g. a patient's test results with their name). Under HIPAA, such data cannot be disclosed to any third party (including an AI service) without a Business Associate Agreement in place. We emphasize: OpenAI's ChatGPT is not HIPAA-compliant by default, since OpenAI will not sign a HIPAA Business Associate Agreement for its consumer services[8]. This means any PHI entered into a public AI tool is effectively an illegal disclosure under HIPAA.

Financial and Accounting Data

Sensitive financial records (profit/loss statements before public release, salary data, sales forecasts, bank account details, etc.). Regulatory standards like SOX or PCI-DSS (for payment data) mandate safeguarding this category. For example, an internal financial projection spreadsheet is proprietary data.

Regulated Technical Data (e.g. ITAR-controlled)

Technical data related to defense or export-controlled technology. Under ITAR (International Traffic in Arms Regulations), any information required for the design, development, production, or use of defense articles is restricted[9]. This can include blueprints, schematics, software, or manuals for military systems[10]. Export laws make it illegal to share such data with foreign nationals or overseas systems without a license. Notably, uploading ITAR-controlled drawings or code to a public AI service would constitute an unlicensed export – a serious violation. ITAR data must remain in controlled environments; a public LLM with servers or staff in other countries would be disallowed[11]. Companies in defense or aerospace have strict rules to prevent this exact scenario.

In short, proprietary information encompasses all data that an organization considers confidential and valuable, whether by law or by competitive necessity. According to data classification examples, this ranges from personal data (PII/PHI) to financial records to trade secrets and beyond[12]. Any such information, if exposed, can harm the company (loss of competitive edge, legal penalties, reputational damage) or violate obligations to others. We will use "proprietary" and "sensitive" interchangeably to refer to this protected data. The above definitions will guide what types of information we are concerned with as we analyze how they end up in AI systems.

Data Exposure Pathways into AI Systems

How does proprietary data actually enter AI models during routine workplace usage? There are several systemic pathways, from deliberate user inputs to background processes. We categorize the main technical vectors by which sensitive enterprise data can flow into external AI systems:

Direct User Input via Web Chat Interfaces

The most straightforward path is an employee typing or pasting data into a chatbot web UI (such as the ChatGPT website). This is voluntary input – the user actively decides to use the AI and provides company information as part of their prompt. For example, an engineer might paste a block of source code into ChatGPT asking for help debugging it, or a salesperson might paste an email draft containing client details for the AI to refine. This scenario played out famously at Samsung: in three separate instances, Samsung employees entered sensitive code and an internal meeting transcript into ChatGPT[13]. One engineer sought a fix for an error by pasting semiconductor equipment source code; another tried to optimize code for defective equipment; a third transcribed a private meeting recording and asked ChatGPT to summarize it[14]. In each case, confidential data (source code and meeting notes) was consciously fed into the AI’s web interface. Because the chatbot is so capable, employees may do this without fully realizing the risk. Any web-based AI interface that accepts text is a potential ingress point for sensitive data. Employees often use these tools to summarize documents, get code suggestions, translate texts, or generate reports, and in doing so they supply internal content to the AI.

IDE Extensions and Coding Assistants

Beyond web UIs, many employees interact with AI through integrated development environment (IDE) extensions or productivity plugins. For instance, developers using tools like GitHub Copilot, ChatGPT VS Code extensions, or similar code assistants are effectively sending code snippets and context to an AI API in real time as they code. These IDE integrations make it seamless – code is shared in the background to get suggestions or explanations. If not configured carefully, they can expose proprietary source code or configuration files. In fact, concerns have been raised that AI pair-programmers could leak confidential code. (GitHub has addressed this by promising that Copilot for Business does not use customers’ code to train models[15], and by offering a setting to not retain prompts[16]. However, the code still leaves the local environment to get an inference result.) There is also risk from third-party extensions: a malicious or poorly built extension could collect far more data than intended. Researchers have found cases of Visual Studio Code extensions leaking secret keys and data in their telemetry or packages[17][18]. Thus, whenever an AI feature is embedded in workplace software (IDE, office suite, chat app), it creates a channel through which internal text or code can flow out to the AI provider. Many companies are only beginning to scrutinize these channels.

Automated API Call Chains

Some data exposure is machine-to-machine. Modern enterprise apps increasingly call large language model APIs under the hood. For example, a CRM system might call an LLM API to auto-summarize customer notes; an IT service desk might use an AI API to draft ticket responses; or a business intelligence tool might feed data into an LLM to generate a natural language report. These API integrations can unintentionally funnel regulated or sensitive data to an external AI service. A developer might hook up an internal database to an AI for analysis, not realizing that every query sends real customer data outside. One concrete case: a mental health platform integrated GPT-3 to help draft therapeutic responses, which involved sending anonymized but sensitive user messages to the API – raising ethical flags about patient privacy. In general, if enterprises do not closely govern software development, teams may chain APIs for convenience (e.g. sending form inputs to OpenAI’s API to get a completion) and inadvertently include fields like names, emails, or financial figures (see the sketch at the end of this section). Incidental data capture also occurs if logging or middleware isn’t careful – e.g. an API gateway might log full request payloads (including the sensitive text) and those logs are stored externally. API-based exposure is harder for the enterprise to notice than a user copy-pasting into ChatGPT, because it can happen in the back-end of tools.

Browser Integrations, Clipboard and Autofill Features

Employees often install browser extensions or use built-in features that incorporate AI. For instance, an extension that summarizes web pages will send the page content to an AI. If an employee is on an internal web application (behind login) and uses such a tool, they might inadvertently send confidential page data (like an internal dashboard or client account info) to the AI service. Similarly, clipboard integration can be risky: some AI assistants monitor the user’s clipboard so they can say “Hey, I see you copied a log file, want me to analyze it?”. This means the extension or app automatically reads whatever the user copies – which could be a password, source code, or a private document – and potentially sends it off for analysis. Autofill or “smart compose” features (like AI completing your sentences in email) might be drawing on context from the email thread, effectively uploading portions of your internal conversation to a cloud model to craft the next sentence. These ambient features blur the line between voluntary input and background capture. The key risk is unintentional exposure: the employee isn’t deliberately pasting sensitive info into the AI, but the tooling scoops it up for convenience. If the browser or OS-level AI assistant lacks strict permissions, it may have too much access. For example, Microsoft’s Windows Copilot (AI assistant) has raised questions about how it might access local data. In summary, AI that hooks into the user’s environment (browser, clipboard, files) can incidentally vacuum up proprietary data if not carefully sandboxed.

Logging, Telemetry and “Background” Data Flows

Even if a user doesn’t directly provide data to an AI, the software and systems around the AI might capture it. Client-side logs: Many applications keep usage logs or send telemetry (for product improvement or crash diagnostics). If an enterprise has an AI tool running on user devices, those logs might include transcripts of AI interactions or snippets of user-provided data, which could then be transmitted to the vendor’s servers. For instance, a ChatGPT desktop app could log the last 20 prompts for debugging purposes and upload that. Server-side logs: On the AI provider side, logs and analytics can also be a pathway. OpenAI has stated it may retain API request data for 30 days for abuse monitoring[19]. If those requests contain proprietary info, they now reside in log files. A bug in March 2023 illustrated this risk: a glitch in OpenAI’s system exposed some users’ conversation titles to other users, and even leaked snippets of payment info, due to a caching error[20]. Those titles were presumably stored for user experience purposes, but they accidentally bled over session boundaries. This is a case where background data (chat history metadata) became visible to others – a clear failure of isolation. Crash reports are another concern: if an AI system crashes or hits an error, a snapshot of memory or the user’s last input might be sent to developers. Without scrubbers, that snapshot could contain confidential text. In essence, every layer of the AI stack that logs or transmits diagnostics can turn into a leakage vector if not handled properly.
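
The sketch below (referenced in the API call chain discussion above) makes that pathway concrete: a back-end integration that forwards an entire CRM record to an external LLM endpoint. It is a minimal illustration, not any specific product's implementation; the endpoint URL, model name, function name, and field contents are all invented for the example.

```python
# Hypothetical sketch of the "Automated API Call Chains" pathway: a CRM
# integration that summarizes customer notes via an external LLM API.
# Endpoint, model name, and field names are illustrative assumptions.
import json
import os
import urllib.request

LLM_ENDPOINT = "https://api.example-llm.com/v1/chat/completions"  # assumed endpoint

def summarize_customer_notes(crm_record: dict) -> str:
    # The developer serializes the whole CRM record for "better context".
    # The prompt now carries PII (name, email) and financial data off-premises.
    prompt = (
        "Summarize the following customer notes for the account team:\n"
        + json.dumps(crm_record)  # includes name, email, contract value, etc.
    )
    payload = {
        "model": "example-model",
        "messages": [{"role": "user", "content": prompt}],
    }
    req = urllib.request.Request(
        LLM_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['LLM_API_KEY']}",
        },
    )
    # At this point the proprietary record has crossed the corporate boundary;
    # the provider may log the payload for days regardless of training opt-outs.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

A safer variant would pass only an allow-listed, redacted subset of fields, or route the call through an internal gateway that scrubs and records payloads before they leave the network.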

Voluntary vs Incidental Capture

It's important to distinguish between voluntary inputs and incidental data capture. Voluntary input is when an employee knowingly sends data to the AI (for example, the web chat and IDE pathways above). Incidental capture is when the system architecture scoops up data in the background (for example, the browser/clipboard and telemetry pathways). Both are problematic: voluntary leaks stem from human choices (often driven by the incentives we discuss later), whereas incidental leaks stem from design decisions (often invisible to the user). Effective risk management must address both. For example, an organization can train employees not to paste secrets into ChatGPT (voluntary), but if the company then installs a network monitoring AI that unintentionally forwards emails to a cloud service, that's an architectural issue (incidental). We will see that many documented failures involve the obvious pathway (employees pasting secrets), but there's growing worry about the less visible pathways (integrations and telemetry).

Stack-Layer Risk Analysis

To truly understand systemic failures, we need to analyze each layer of the AI usage stack, from user experience down to the model, and identify how protections can fail. We adopt a layered approach, examining risk factors at: the user interface (UX) level, the identity/authentication level, the prompt/model context handling, the AI provider’s back-end, and the client vs server side divide.

UX Defaults and Design

The design of AI interfaces often prioritizes usability and immediate utility over security. By default, most AI chatboxes are an empty text field inviting you to paste anything. There are usually no prominent warnings or friction to discourage sensitive data input. For instance, when users open ChatGPT’s interface, they see a friendly prompt like “Ask me anything” – there is no built-in data classification filter that interjects with “Hey, this looks like source code, are you sure?” From a UX perspective, the lack of context or labels can lead employees to treat the AI like a trusted internal tool. They may falsely assume “it’s just software, it must be safe to use,” especially if the UI is professional-looking. Moreover, features like code formatting, file uploads, or image inputs can encourage richer data sharing. UX convenience can thus override caution. Some failures: employees pasting entire documents or logs in one go because the UI allows it. Contrast this with, say, email DLP systems that pop up a “This looks sensitive, are you sure?” prompt – AI tools rarely offered that as of 2023. The lack of real-time feedback (“this prompt was sent to an external server”) makes the boundary crossing invisible. Unless an organization has modified the UI or deployed browser plugins to flag company-confidential content, the default UX can be a one-way funnel out. The first time an employee realizes their mistake might be when it’s too late. It’s telling that many companies started adding banner warnings on internal networks (e.g. “Reminder: Do not input confidential data into ChatGPT”) precisely because the tool itself didn’t provide those cues. In short, the UX layer risk is over-sharing by design – nothing in the interface prevents or even visibly logs the act of sending sensitive data out.
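
That missing friction does not have to be sophisticated to help. Below is a minimal sketch of a client-side pre-submission check; the regular expressions, labels, and confirmation flow are illustrative assumptions rather than a vetted DLP rule set, and a real deployment would use the organization's own classifiers and secret-scanning rules.

```python
# Minimal sketch of a pre-submission "are you sure?" check for AI prompts.
# The patterns below are illustrative only.
import re

SENSITIVE_PATTERNS = {
    "possible API key or token": re.compile(r"\b[A-Za-z0-9_\-]{32,}\b"),
    "email address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "US SSN-like number": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "internal marking": re.compile(r"(?i)\b(confidential|internal use only)\b"),
}

def flag_sensitive(prompt: str) -> list[str]:
    """Return human-readable reasons why a prompt looks sensitive."""
    return [label for label, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(prompt)]

def confirm_before_send(prompt: str) -> bool:
    """Ask the user to confirm when a prompt trips any pattern."""
    reasons = flag_sensitive(prompt)
    if not reasons:
        return True
    print("Warning: this prompt appears to contain:", ", ".join(reasons))
    return input("Send to external AI anyway? [y/N] ").strip().lower() == "y"
```

Even a crude check like this restores the real-time feedback the default UX lacks: the employee is told, before the data leaves the boundary, that the prompt looks sensitive.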

Identity, Authentication & Account Scoping This layer concerns who is accessing the AI and under what account context. A major risk is employees using personal or unregulated accounts to do work. If an employee signs up with their personal email on ChatGPT, anything they input goes into an account the company doesn’t control. That means no enterprise ability to audit or delete the data, and no guarantee of contractual protections. According to one report, 71.6% of generative AI usage in companies was via unmanaged, personal accounts rather than company-approved accounts[21]. This “shadow AI” usage bypasses enterprise Single Sign-On and DLP monitors. Even when companies deploy official AI tools, workers might still prefer the unfettered personal version if it’s more powerful or convenient. Additionally, inadequate scoping of authentication can cause cross-project leaks. For example, if a company shares one API key among many developers, data from multiple projects might all get logged under that key at the provider side, muddying access control. Or if an organization uses a third-party chatbot that isn’t multi-tenant aware, one client’s data might accidentally appear to another (this happened with the ChatGPT history bug, essentially an auth/session isolation failure[20]). Account scope also relates to tiered access: ideally, an enterprise would have an “enterprise account” with the AI provider where data is siloed, but if employees use free accounts, their data might get pooled into the general model training set. The risk at this layer is that policies tied to identity (like AUPs or role-based access) get circumvented by the lack of enterprise identity linkage. If Bob the engineer uses ChatGPT under his personal Gmail, corporate rules don’t technically apply in that context and the company has no visibility. Furthermore, it’s often easier to register for a new SaaS account than to request an official tool – a structural incentive problem. Summarily, authentication failures (e.g. using wrong accounts, or session leaks) mean sensitive data lands in places it doesn’t belong and cannot be traced or purged by the organization.
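
One way to keep AI usage tied to enterprise identity is to route all outbound calls through an internal gateway that holds per-project provider keys and keeps an on-premises audit trail. The sketch below illustrates that idea under stated assumptions: the ProjectKeyStore class, the audit-log structure, and the premise that keys are provisioned per project are all hypothetical, not a description of any provider's product.

```python
# Sketch of an internal AI gateway that scopes outbound LLM calls to a
# corporate identity. All names here are hypothetical.
import datetime
import uuid

class ProjectKeyStore:
    """Maps internal projects to their own provider API keys (assumed to be
    provisioned per project so usage can be audited and revoked separately)."""
    def __init__(self, keys: dict[str, str]):
        self._keys = keys

    def key_for(self, project: str) -> str:
        return self._keys[project]  # KeyError means project not approved for AI use

def route_request(user_sso_id: str, project: str, prompt: str,
                  keys: ProjectKeyStore, audit_log: list) -> dict:
    """Build a provider request tied to enterprise identity, not a personal account."""
    request_id = str(uuid.uuid4())
    audit_log.append({                      # retained on-premises for later review
        "id": request_id,
        "user": user_sso_id,
        "project": project,
        "time": datetime.datetime.utcnow().isoformat(),
        "prompt_chars": len(prompt),        # log size/metadata, not the content itself
    })
    return {
        "api_key": keys.key_for(project),   # per-project key, never a shared one
        "metadata": {"request_id": request_id},
        "prompt": prompt,
    }
```

The design choice is that identity and audit data stay inside the enterprise, while each project's key can be revoked independently if misuse is detected.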

Prompt Construction & Dynamic Context Windows This layer concerns how data is packaged into prompts and how the AI’s context memory might carry data forward. Modern AI assistants keep a conversation context – earlier messages stay in a “window” of memory for subsequent replies. This can lead to unexpected retention of proprietary info. For example, if an employee first asks “Summarize our Q3 strategy” (providing internal info) and then later in the chat asks an unrelated question, there’s a risk the model might draw on the earlier confidential context in a later answer. If that later answer is shared or seen outside, it could leak info indirectly. Prompt construction also involves hidden or system prompts: some integrations prepend user data behind the scenes. A customer service chatbot might automatically include customer profile data in the prompt it sends to the LLM (“This is a VIP client with ID 12345, now draft a response”). If those system-added fields aren’t handled carefully, an outsider querying the model could potentially trigger a reveal of such hidden context (via prompt injection attacks, for instance). The dynamic context window means the model is juggling a lot of recent information – and there have been cases where it mistakenly included a prior user’s data in a new user’s answer due to a context mix-up (OpenAI’s March 2023 bug was essentially this – some users saw conversation snippets from others[20]). So, at the model interaction layer, failures include data lingering in memory longer than intended, or being packaged in prompts without proper segmentation. If a user is not cautious, they might also chain together copy-pastes: e.g. paste a confidential draft, get AI edits, then copy the AI output (which still contains the sensitive text) and paste that output in a public forum. They may think “the AI rewrote it, so it’s safe,” not realizing it’s essentially the same info rephrased. Additionally, if the prompt is built programmatically, there’s risk that extraneous confidential data is accidentally concatenated (like a developer who accidentally sends an entire file of config secrets as part of the prompt context by using a wrong variable). These scenarios show how the way prompts are constructed and remembered can cause proprietary data to bleed in unpredictable ways beyond the initial query.

Model Provider Retention & Audit Policies

This is a critical back-end layer. It concerns what the AI service does with the data it receives. A core risk is that providers may retain user inputs and use them for training or human review, unless explicit arrangements (or settings) prevent it. OpenAI’s default policy (for consumer ChatGPT) has been to use conversation data to improve the model and not guarantee immediate deletion[22]. In fact, OpenAI’s terms historically allowed it to retain data for model training unless customers opt out. They introduced an opt-out for API users – as of 2023, data submitted via the API isn’t used for training by default[19], and API logs are deleted after 30 days (with shorter retention options)[19]. However, users of the web interface had to manually turn off chat history to prevent training usage. Not all providers even offer opt-out. A Stanford study of 6 leading AI companies found all of them feed user inputs back into training by default, with some keeping data indefinitely[23]. Anthropic, for example, quietly updated its policy to use Claude chatbot conversations for training by default (with an opt-out)[24]. This means any proprietary data entered might end up embedded in the model’s weights or training corpus. The risk is two-fold: future model outputs could regurgitate that data (we’ll discuss that under empirical evidence), and provider insiders or contractors may see the raw data during model improvement. OpenAI has acknowledged it employs human reviewers to flag abuse and improve quality[25]. They review a sample of conversations (including ones with sensitive content) for safety and fine-tuning purposes[25][26]. These reviewers could be employees or third-party contractors[27] operating under NDAs, but it’s still another exposure. If a company’s trade secret was in a prompt, now potentially dozens of people (across the world, in contractor firms) might have seen it during annotation[27]. Provider retention policies are often opaque; even if they say “we delete after N days,” there can be exceptions (e.g. data flagged for abuse might be kept longer, or data in backups). A stark example occurred with the Samsung leak: after Samsung engineers’ code went into ChatGPT, Samsung worried that the code would become part of OpenAI’s model and hence available to others[22]. This fear is well-founded given default policies. Additionally, providers might be compelled legally to retain or hand over data. Recently, OpenAI was ordered by a court to preserve all ChatGPT data (overriding the 30-day deletion practice)[28][29]. This kind of legal hold means data thought ephemeral might in fact be stored indefinitely for litigation. In summary, at the model provider layer, systemic incentives (training on user data to improve models) and policies can turn one employee’s lapse into a permanent learning for the AI. Unless the organization uses a strictly opt-out or self-hosted service, the data doesn’t just vanish after the answer is returned – it lives on.

Client-Side vs Server-Side Processing

Lastly, we consider where the computation happens – locally or in the cloud – and the different risks each presents. Cloud inference (server-side) means the sensitive data leaves the local environment, traverses the internet, and is processed on external servers. This inherently expands the attack surface: data in transit could be intercepted (if not encrypted properly), data at rest on the server could be accessed by unauthorized parties, and multi-tenant cloud systems could potentially mix up data between clients (as happened with the conversation bleed bug). Server-side also implies reliance on the provider’s security controls; a breach of the provider could expose all submitted data. We’ve seen isolated incidents: for example, an OpenAI credential leak or misconfiguration could hypothetically expose stored prompts. In contrast, on-device or client-side AI (running the model locally) keeps data within the user’s control environment, reducing risk of external exposure. However, client-side AI introduces other concerns: the model files themselves might be sensitive (if it’s a proprietary fine-tuned model on a laptop, theft of the device is a risk), and client-side logging can still leak data (e.g. a local app might write conversation history to disk, which then gets synced or backed up to cloud storage unknowingly). Moreover, many “client” apps are actually thin UIs with cloud backends, so sometimes what seems client-side isn’t. One interesting risk on the client side: if an employee installs a malicious AI desktop app or browser plugin, it could log keystrokes or steal data under the guise of providing AI functionality[30]. Researchers found fake ChatGPT browser extensions that quietly harvested users’ conversation histories and even Facebook cookies[31][30]. This is a supply chain risk – using untrusted client software for AI could leak data to an attacker rather than a reputable provider. On the flip side, server-side logging is typically far more extensive. Cloud providers often log metadata and sometimes content for monitoring. For instance, Azure’s OpenAI service retains prompts for up to 30 days for abuse tracing (even if opting for “no storage” for training, they still may hold data briefly for content moderation)[32]. The difference in risk: on-device, data exposure is mostly within the physical and IT control of the enterprise (unless the device itself is compromised), but you may sacrifice the powerful capabilities of huge cloud models; in the cloud, you get powerful AI, but must trust external safeguards and accept that data goes off-premises. This trade-off is so significant that many organizations are actively evaluating running smaller models internally for their most sensitive data, and using cloud AI for less sensitive tasks – essentially risk-based routing. In summary, client- vs server-side considerations boil down to control vs capability, and each has failure modes (device malware or loss on one side, cloud breaches or multi-tenancy leaks on the other). Later in mitigations we will revisit this with “on-device vs cloud” as an architectural decision.
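
A risk-based routing layer of the kind described above can be very small. The sketch below assumes a self-hosted model endpoint on localhost and a hypothetical cloud endpoint; the classify_sensitivity() helper is a placeholder for whatever classification or labeling tooling the organization already uses.

```python
# Sketch of risk-based routing between a local model and a cloud provider.
# Endpoints and the classifier are hypothetical stand-ins.
LOCAL_MODEL_URL = "http://localhost:8080/v1/completions"        # self-hosted model
CLOUD_MODEL_URL = "https://api.example-llm.com/v1/completions"  # external provider

def classify_sensitivity(prompt: str) -> str:
    """Placeholder classifier: real deployments would use DLP rules or
    document labels (public / internal / confidential / restricted)."""
    markers = ("confidential", "internal use only", "itar", "patient")
    return "restricted" if any(m in prompt.lower() for m in markers) else "general"

def choose_endpoint(prompt: str) -> str:
    # Sensitive prompts stay on infrastructure the enterprise controls;
    # everything else may use the more capable cloud model.
    if classify_sensitivity(prompt) == "restricted":
        return LOCAL_MODEL_URL
    return CLOUD_MODEL_URL
```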

By analyzing each stack layer, we see that multiple independent failures can align to cause a leak: a permissive UI + an untrained user + use of a personal account + a provider that retains data + a model that memorizes it = proprietary info appearing in a stranger’s chat output. Next, we examine why users share such data despite these risks (structural incentives) and then real cases where things went wrong.

Structural Incentives Driving Data Exposure

It’s easy to blame employees for mistakes, but it’s crucial to understand why smart professionals end up pasting secret data into an AI. Several systemic and organizational factors create strong incentives or pressures that lead to these behaviors:

Productivity Pressure and "Efficiency Traps"

Modern workloads are intense – developers face tight deadlines, analysts juggle massive data, marketers churn out content under time pressure. Generative AI offers a tantalizing boost in productivity. An engineer who is stuck on a bug knows that ChatGPT might solve in 30 seconds what could take them half a day. Under deadline, the immediate reward (a quick fix) often outweighs the abstract risk (data leakage). The pressure to deliver results fast can override caution. In the Samsung incident, for example, the engineers likely used ChatGPT to quickly troubleshoot code and transcribe meetings, activities aimed at efficiency[14]. The competitive advantage of AI-assisted work is so significant that employees may feel they’re falling behind if they don’t use it. We’ve reached a point where some feel not using AI is a career disadvantage. This pushes people to use whatever tool is at hand, even if unofficial. If the company hasn’t provided a sanctioned alternative, the path of least resistance is the public tool. Essentially, the immediate incentive of productivity gain is concrete and personal, whereas the risk of data exposure feels distant and diffuse. This imbalance leads to rationalized behavior: “I’ll just do it this once, it’ll save me hours.”

Poor Data Classification and Awareness Tools

Many organizations struggle to equip employees with easy ways to identify what is sensitive. If everything is marked “internal” without nuance, employees become numb to it. Conversely, if nothing is labeled, they may not realize a piece of information falls under “trade secret” or “export controlled”. Tools that could help (like automatic classifiers, or pop-up warnings when copying certain content) are often not in place. Thus, employees operate in a gray zone of judgment. If a developer has a code snippet, they might think “this isn’t the entire source code, just a snippet – is it really secret?” Without clear guidance, they err on the side of getting help. In some cases, staff genuinely do not know that copying that data violates policy, because policies haven’t kept up. One survey revealed over half of employees using generative AI didn’t see a problem inputting sensitive company data[33]. This indicates an awareness gap. Also, if employees wanted to double-check sensitivity, the tools may be lacking – e.g. searching an internal database to see if data is classified highly might be impossible, so they rely on gut feeling. Poor classification leads to ambiguous boundaries in the employee’s mind, which leads to mistakes.

Ambiguous Boundaries of AI Use

AI systems blur traditional notions of “external vs internal.” An employee might think: “I’m just using a software tool provided by a reputable company (OpenAI/Microsoft); it’s not like I’m posting the data on social media.” There’s a psychological effect where using a private chat interface feels one-to-one and secure, even if it’s not. Unlike emailing a competitor (which clearly feels like a breach), typing into ChatGPT doesn’t feel like telling a person – it feels like talking to a robot assistant. This illusion of privacy is reinforced by AI often being framed as secure assistants. Unless explicitly told otherwise, a user may assume their conversation is confidential. The boundaries of what constitutes “disclosure” are ambiguous: is giving data to OpenAI an external disclosure? Legally yes (OpenAI is a third party), but employees might not equate it with, say, publishing data. Additionally, some employees believe the AI is only using the data transiently to give them an answer, and won’t store it (which, as we saw, is not always true). This misunderstanding of the AI boundary – thinking of it as a contained tool rather than part of a global model – incentivizes use. Organizational silence or vagueness compounds this: if management hasn’t explicitly forbidden it, employees might assume it’s tolerated. Only recently have many companies started drawing clear lines (some banning use outright, others issuing guidelines). In early 2023, there was a wild west period where employees experimented freely with little guidance, often crossing lines unintentionally.

Lack of Sufficient Internal Alternatives

In many cases, employees turn to external AI because the internal tooling is inferior or non-existent. If a company does not provide a secure, on-premise AI assistant or a vetted SaaS with proper agreements, employees who see the value of ChatGPT will use it anyway in “shadow IT” fashion. For example, if translating a document or summarizing a lengthy report is tedious and the company doesn’t have an approved AI solution, a worker might quietly paste it into an online AI service. The perceived benefit is immediate and tangible, whereas requesting an official tool could take months through IT bureaucracy. Some companies have internal knowledge bases or search tools, but they might be clunky compared to just asking a well-trained LLM for the answer. We’ve essentially seen consumer-grade AI leap ahead of many enterprise IT capabilities. This creates an innovation gap: employees feel they have to “bring their own AI” to work to be efficient. A concrete example is developers using GitHub Copilot on private code because their company has no equivalent code helper – they know there’s a slight risk but the productivity boost is huge. Or consider non-technical staff: tools like ChatGPT can automate Excel formulas, write SQL queries, draft emails – tasks that could take them hours. If internal IT hasn’t offered something, workers will gravitate to what’s available publicly. In essence, convenience and capability drive usage. This is not about negligence; it’s often a rational choice in context. Organizations that banned ChatGPT outright sometimes faced pushback or quiet non-compliance because employees felt they were being asked to work with one hand tied behind their back.

Cultural and Organizational Drivers

Beyond pure productivity, there’s also a culture of experimentation in many firms that implicitly encourages pushing boundaries. Early adopters might share success stories (“I asked ChatGPT to optimize our code and it worked!”), creating a bandwagon effect. If leadership isn’t clear, employees might interpret the lack of negative feedback as tacit approval. Sometimes even managers encourage their teams to leverage AI to increase output, without fully realizing the implications – thereby incentivizing use through performance metrics. For example, if a content team is told to double output, they might lean on AI writing, which means feeding internal style guides or draft content into ChatGPT. The blame is not on the individuals here; it’s on the organization not aligning its incentives (speed, output) with its policies (safety, confidentiality). This is why experts urge focusing on systemic solutions rather than just telling employees “be careful.”

In summary, employees expose sensitive data largely because the environment nudges them to do so: the quick wins are immediately visible (faster coding, easier writing), whereas the downsides are invisible or seem unlikely. The organization often has not provided a safe path to get those same wins, leaving the public AI as the attractive option. The solution is not to scold individuals but to change those incentives: provide internal tools, training, and clear boundaries such that the desirable behavior (protecting data) aligns with the convenient behavior. We will discuss mitigations later. First, let’s examine what AI providers do on their side with the data (since that heavily influences the risk) and then look at real incidents that have occurred.

Provider Data Handling: Policies vs Practices

What do major AI providers (OpenAI, Google, Microsoft, Anthropic, etc.) actually do with the data users submit? This question is critical for evaluating risk. We consider retention durations, use in training, human access, and official policies – and we contrast stated policies with any known practices or third-party findings.

OpenAI (ChatGPT/GPT-4)

OpenAI’s practices have evolved and now differ between its consumer services and its enterprise/API offerings. By default, data entered into ChatGPT (Free or Plus versions) is retained and used to improve the model. OpenAI’s help center openly states that conversations may be reviewed by AI trainers to fine-tune models and develop new features[25]. These conversations are stored on OpenAI’s servers. OpenAI implemented a setting to turn off chat history (thereby opting out of training use), but unless a user actively does that, the chats are fair game for training. OpenAI has said that when history is off, it still retains the data for 30 days for abuse monitoring, then deletes it[26]. Enterprise-tier customers (ChatGPT Enterprise or via Azure OpenAI) get stricter guarantees: OpenAI promises not to use enterprise data for training by default[19] and allows opting for zero-retention logging. Specifically, OpenAI’s API as of 2023 does not use submitted data to improve the model unless the customer opts in[19]. They also introduced a 30-day default retention for API data (down from indefinite in earlier days)[19]. However, that 30-day period is still a window of exposure. Notably, a recent legal case (OpenAI vs. NY Times) forced OpenAI to temporarily keep all user data longer than 30 days[28][29], illustrating how external demands can override policy. So the retention can extend if legal or security reasons arise[29].

OpenAI also has humans in the loop. As described earlier, human reviewers can read conversations for several purposes: safety (checking for disallowed content), quality (random sampling for performance), and fine-tuning data selection[25][34]. RedactChat (a privacy tool blog) summarized that OpenAI contractors do access some chats, including those with sensitive info, despite user opt-outs (opt-out stops training use, but OpenAI still may review for abuse for 30 days)[26]. In practice, users have no visibility into whether their specific chat was read by a person. OpenAI claims it’s a small percentage and reviewers are under NDAs[35]. But the risk exists that an especially interesting piece of data (say, a novel piece of source code or a juicy business plan) might catch a reviewer’s eye. Additionally, OpenAI has automated scanning for certain content which flags conversations for review[36]. If proprietary data accidentally triggers a keyword (e.g. a project code name that happens to match a banned term), it could be escalated.

Anthropic (Claude)

Anthropic initially marketed Claude as more privacy-conscious, but as Stanford’s study found, they switched to using conversations for training by default as well[24]. Users can opt out via an email request. They state data may be retained for safety monitoring. It’s likely similar to OpenAI: data is kept for some weeks, and sampled for improving the model. Claude is offered via API and some chat interfaces (like Poe). If one uses an unmanaged Claude instance, assume the data can be used to “help Claude get smarter.”

Google (Bard/Gemini)

Google’s historical privacy posture is that it does not use data from enterprise Google Cloud customers to improve services unless agreed. For the consumer-facing Bard (the experimental chatbot), Google’s privacy policy indicates that it collects conversations and may review them to improve quality, with strict access control. Google has massive data from other sources, so training on user prompts is perhaps less critical for them, but they likely still log everything. One unique aspect with Google is that they could potentially combine your chats with your Google account data (if you’re logged in), though they have said Bard data is kept separate from ad targeting, for example. Still, the Stanford research noted that for multiproduct companies like Google and Meta, there is a tendency to merge chat data with other user data to enrich user profiles[37], unless stated otherwise. Google has introduced an enterprise version of Bard as part of Google Workspace with guarantees that data isn’t used to train models or seen by human reviewers. For general Bard users, it’s safe to assume some retention and human evaluation occurs (Google explicitly warns users not to include sensitive info in Bard queries).

Microsoft (Azure OpenAI & Copilot)

Microsoft’s Azure OpenAI Service (which offers GPT-4, etc. in the Azure cloud) promises that customer prompts and completions are not used to train the base models[19]. Microsoft also offers a “zero retention” mode for certain Azure OpenAI instances (especially for government or regulated customers) where even telemetry is minimized. They do, however, retain data for 30 days by default for abuse monitoring (similar to OpenAI’s policy, since it’s the same model) unless you have an approved “Zero Retention” clause[38]. On the GitHub Copilot side (especially Copilot for Business/Enterprise): Microsoft states it does not use customers’ code snippets to train models[15], and it provides content filters plus an option to avoid even retaining prompts beyond the session[39]. In Copilot’s case, some telemetry about usage is kept (e.g. events, acceptance rates) but the actual code content for business users is not retained longer-term[16]. Copilot Free (for individuals) historically did send data to OpenAI and possibly retained it for some period (28 days for prompts under some circumstances[40]). Microsoft has also introduced internal governance controls: for example, Copilot Chat for Teams (an AI that can answer questions about your documents) keeps data within that tenant and does not feed it back to Microsoft’s foundation models. So Microsoft’s approach for enterprise is largely opt-in for data usage, which is a response to enterprise demand. It’s worth noting Microsoft likely does some caching and logging even for enterprise customers – for example, to debug an issue with the service, support might access a limited log. But by policy, they segregate it and delete it as per compliance standards (they tout alignment with SOC 2, ISO 27001, etc. for these services, meaning they have to follow strict retention and access rules).

Meta (Llama 2 and others)

Meta released Llama 2 as a model (which can be run on-premises), so in those cases data handling is entirely up to the user (no data is sent to Meta). For their own chatbot applications (like Meta AI in WhatsApp or Instagram), presumably they gather those conversations for training their models (since they feed into improving the model’s conversational abilities). Meta’s ad-driven business model raises interesting questions – but that’s beyond enterprise use for now, as most businesses aren’t using Meta’s chat for work.

Other Providers

AWS has Amazon Bedrock (offering models from Anthropic, Stability, etc.), which emphasizes that it does not store or use content for training when you use the service – akin to Azure’s stance. IBM’s watsonx and Oracle’s AI services also pitch privacy – not using client data for model improvements. These enterprise-focused providers understand that clients want full ownership of their inputs/outputs. That said, even if data is not used for training, they might log requests for a short period or for compliance.

Human Review & Abuse Monitoring Nearly all providers have clauses that they may review content to prevent abuse or improve the service. For instance, OpenAI explicitly retains the right to monitor for terms of service violations[25][36]. This typically involves automated filters and then humans for flagged cases. From a security perspective, this means any prompt containing disallowed content (which might include things that look like personal data or secrets) could get flagged and then read by a human. An employee inadvertently including a password in a prompt might trigger a security filter (e.g. if the AI detected a 16-character random string that looks like an API key, it might flag it). A human might then examine that conversation to verify if it’s misuse. So ironically, trying to get the AI to check a leaked credential could actually increase the exposure. Abuse monitoring also entails storing some metadata – e.g. IP addresses, user IDs, timestamps – which could be subpoenaed or used in forensic analysis if needed.

Policy vs Reality

Publicly, providers reassure users that data is safe and only used in beneficial ways. But third-party research often finds gaps. The Stanford policy study noted a lack of transparency – policies were hard to interpret and often incomplete on specifics[41][42]. It also highlighted that some companies claim to de-identify data used for training, but it’s unclear how effective that is[43]. (If de-identification is just dropping names, a chunk of code or a paragraph of strategy might still be recognizable to insiders even without explicit names.) Another reality check: researchers have successfully extracted training data from GPT models, including personal information. For example, academic work by Carlini et al. showed that GPT-2 had memorized hundreds of snippets of its training text, including private info like someone’s contact information, which they could retrieve by cleverly querying the model[44][45]. More recently, those researchers applied a similar attack on ChatGPT and managed to extract verbatim pieces of its training data – including what appeared to be private communications or content from the internet[46][47]. They got out addresses, phone numbers, etc., demonstrating that data which OpenAI presumably thought was sufficiently mixed or anonymized was in fact directly recoverable[47]. This doesn’t necessarily mean user prompts (from ChatGPT usage) were extracted – it could be data from the web – but it confirms that if user data made it into training, it could later resurface. OpenAI and others apply some filtering to avoid regurgitation (the model is instructed not to reveal verbatim text that looks like a secret), but as the researchers noted, alignment is not foolproof[48]. The researchers themselves described it as “wild” that their attack worked on a production model[49]. This is third-party validation that model providers may not have fully mitigated memorization of user data.

In summary, major providers differ slightly in policy, but a general theme emerges. Consumer-facing AI services: assume your data is retained and likely used to improve models, unless explicitly stated otherwise. Enterprise-oriented services: these promise not to use data for training and offer stricter retention and access controls, but you must trust that (and ideally have a contract). All providers engage in some level of logging and human or automated review for safety. Public policies often gloss over these nuances (or bury them in privacy fine print). The reality is that once proprietary data leaves your organization and enters an AI provider’s domain, you should assume it might be seen by humans, stored for a non-trivial period, and could influence future AI outputs[22][23]. Looking forward, some providers are launching more privacy-centric offerings (OpenAI’s ChatGPT Enterprise, which offers encryption and admin-controlled retention, or OpenAI’s coming “ChatGPT Gov” for government with higher compliance). These are responses to the very failures and gaps we’re discussing – essentially, new products that eliminate training on user data and lock it down to meet enterprise needs.

Compliance and Enforcement Gaps

Many organizations have confidentiality agreements, policies, and compliance regimes meant to prevent exactly the kind of data exposure we’re discussing. Yet exposures are happening – indicating gaps where these protective frameworks break down in practice when AI enters the scene. Let’s examine a few key areas: NDAs and internal policies, industry standards like SOC 2/ISO 27001, and specific regulations (HIPAA, ITAR, etc.), to identify where enforcement is failing or lagging.

Non-Disclosure Agreements (NDAs) and Trade Secret Duties

Employees and partners are typically bound by NDAs or contract clauses not to disclose a company’s confidential information to outsiders. In theory, entering sensitive data into an external AI service is a disclosure to a third party (the AI provider). Under trade secret law, if you share a secret with someone outside the circle of trust without a confidentiality obligation, you can lose protection[50]. So how are NDA obligations failing? First, many NDAs and policies didn’t explicitly list “AI services” as forbidden recipients (at least until 2023), so employees might not mentally connect “typing into ChatGPT” with “disclosing to a third party.” Second, enforcement is tough – companies often don’t know it happened until after the fact. Samsung, for instance, only discovered the leak after the fact and then had to ban ChatGPT usage to stop further incidents[51]. A striking example: Amazon’s internal legal team saw ChatGPT outputs that “closely matched internal Amazon data”, which alarmed them that employees were feeding confidential info in[52]. Amazon’s attorney had to send out a memo reminding employees that any Amazon confidential information (code, etc.) must not be shared with ChatGPT, explicitly framing it as a violation of their policies[52]. In other words, Amazon found NDA breaches were already happening (“I’ve already seen instances…” the lawyer wrote[53]) via ChatGPT and had to clamp down. This shows NDAs alone weren’t enough; employees needed specific guidance. Another gap: NDAs typically rely on trust and the threat of legal action after a breach. They don’t technologically prevent someone from sharing. In the context of AI, by the time a company could prove a violation (if they ever find out), the data is out and possibly integrated into a model. Enforcement after the fact (suing an employee for breach) doesn’t put the genie back in the bottle. So the NDA framework isn’t well suited to this kind of micro, inadvertent “leak.” It’s not willful industrial espionage; it’s tiny disclosures at scale. To address this, some companies updated employee agreements or handbook policies in 2023 to explicitly prohibit entering company data into unapproved AI tools. But again, enforcement relies on monitoring or self-reporting, which many companies lack (monitoring ChatGPT usage raises privacy and trust issues of its own).

Acceptable Use Policies (AUPs) and Internal IT Policies

Most organizations have AUPs telling employees what they can or can’t do with company systems and data (e.g. “Don’t upload company data to unauthorized cloud services” was a common line). Using ChatGPT might violate those generic rules, but that assumes employees equate ChatGPT with an “unauthorized cloud service.” Initially, many did not. The absence of a clear mention meant an employee could genuinely think they weren’t breaking rules. Now, companies are racing to update AUPs. According to Gartner, as of mid-2023 about half of HR or IT leaders were formulating guidance on ChatGPT use[54]. The gap period before explicit policy was a Wild West. Even with policy, enforcement is challenging: the behavior often happens on personal devices or browsers. Traditional corporate DLP solutions might catch someone trying to send an email with an attachment out of the network, but they might not detect someone using a web-based chatbot (especially over HTTPS, with no obvious logging of what was sent). One data point: LayerX Security’s report found that generative AI tools have rapidly become the top channel for “corporate-to-personal” data exfiltration, accounting for 32% of such incidents[55]. This suggests that despite AUPs, employees are bypassing corporate controls by using personal accounts and browsers[1]. The AUP might say “don’t put data on external apps,” yet 18% of enterprise employees were pasting data into AI tools, and over half of those paste events included corporate info[56]. That’s a lot of policy violations flying under the radar. Clearly, the monitoring and technical enforcement of AUPs didn’t anticipate this vector. Companies that installed web filtering to block ChatGPT saw some success, but others saw employees simply use phones or other means. Culturally, if AUPs are too restrictive (“no AI use at all”), they risk being ignored en masse because, as discussed, employees feel using AI is necessary. This is an enforcement gap: rules that are impractical will be broken. The sweet-spot policy (which some companies adopt) is to allow AI use for non-confidential data only. But that requires user judgment – which, without technical guardrails, is often unreliable. The Cloud Security Alliance recommends making the AUP explicit on generative AI and providing training[57]. Without that, AUPs remain a paper shield.

SOC 2 / ISO 27001 and Security Controls

These industry standards (SOC 2 is a security control audit for service organizations; ISO 27001 is an information security management standard) require that companies manage data securely, control access, and prevent data leakage. A SOC 2 certified company, for example, would have controls like “Confidential data is encrypted in transit and at rest” and “Data transfers are approved and monitored.” However, these frameworks are high-level and risk-based – they may not have had specific controls for “use of AI SaaS.” An ISO 27001 risk assessment in 2022 might not have listed “employee using external AI” as a threat, so corresponding controls weren’t implemented. SOC 2 has criteria about data communications and access, which arguably should cover this (e.g. employees shouldn’t send sensitive data to unassessed vendors), but if the risk wasn’t identified, it wasn’t audited. Now, auditors are catching up – some companies undergoing SOC 2 audits in late 2023 report auditors asking “What’s your policy on generative AI usage?” So the gap was one of timing and specificity. In practice, a company could be SOC 2 certified and still have had a Samsung-style leak, because the specific scenario wasn’t scoped. One could argue that using ChatGPT to handle source code is a breach of “logical access controls” and “vendor management” (since OpenAI wasn’t vetted as an authorized vendor for code), but if no one interpreted it that way, the control failed silently. Another compliance aspect: even if companies had policies, there’s often no automated enforcement. For instance, a DLP rule to block uploads of certain file types to the web might not catch someone pasting text. SOC 2 doesn’t mandate specific technology like DLP; it just says protect data. So unless management explicitly extended those protections to AI usage, there’s a gap. The “ticking time bomb” mentioned in the LayerX report is apt: many organizations bound by GDPR, HIPAA, SOX, etc. are unknowingly creating compliance violations through this uncontrolled AI data flow[58]. It’s a quiet gap – everything looks fine until a regulator or auditor asks, “Can you show data wasn’t leaking via AI?” and the answer is “uh, we never checked.” We expect future SOC 2/ISO guidance to explicitly cover this scenario, but during 2023 a lot of compliant organizations still experienced leaks, proving a disconnect between policy and reality.

HIPAA (Health Data Privacy)

We touched on this earlier, but to reiterate: HIPAA prohibits sharing PHI with any service provider that isn't a contracted, HIPAA-compliant entity. OpenAI is not such an entity, and as of 2025 it still will not sign BAAs for ChatGPT[8]. Therefore, any use of ChatGPT with actual patient data is a direct HIPAA violation. Why is it happening anyway? Because clinicians or staff either aren't aware or choose to do it under the radar to save time. There have been anecdotal reports: a doctor using ChatGPT to write an insurance appeal letter that included patient details (violating HIPAA), or hospital staff using it to summarize clinical notes (which contain PHI). One survey by an electronic health record (EHR) software vendor found that a non-trivial percentage of medical professionals admitted to using ChatGPT with patient-related information, often rationalizing that they had anonymized it – though "anonymization" is tricky; if any identifier remains, it is still PHI. This clearly goes against training and may be punishable, but enforcement relies on catching it. Healthcare organizations typically have strict training ("never put patient info in unauthorized apps"), and they deploy DLP that blocks, say, emailing patient data to personal accounts. But a web request to chat.openai.com might slip through if not specifically blocked. Ease of access also overcame caution for some: early adopters enthused about using ChatGPT to draft patient letters (some even wrote articles about it, then quickly realized the compliance issues). We haven't seen public breach fines from ChatGPT use yet, but if PHI gets exposed it is only a matter of time. The gap here is partly human error (or willful risk-taking), and partly the lack of officially sanctioned alternatives. The HIPAA Journal notes that unless an AI tool undergoes a security review and signs a BAA, it cannot be used with ePHI[59]. However, some vendors (such as Microsoft with Azure OpenAI in a HIPAA-eligible environment, or Google with its Med-PaLM 2 model[60]) are starting to offer compliant options. The sooner healthcare entities adopt those, the less temptation staff will have to use the forbidden ChatGPT. So the HIPAA enforcement gap is essentially "the rule is clear, but people found a shiny new tool and used it anyway in the absence of alternatives." It is an example where policy was strong but oversight failed.

ITAR and Export Controls: For companies dealing with defense-related technology, ITAR and the EAR (Export Administration Regulations) impose legal controls on technical data. An engineer sharing ITAR-controlled technical schematics with an AI whose servers or support staff include foreign nationals is committing an export violation; penalties can include hefty fines and even criminal charges. One might think that alone would deter anyone, yet the ambiguity of AI tools struck here as well. Some defense contractors have asked, "Is it okay to use ChatGPT for non-technical or unclassified material?" The safest interpretation is to avoid it entirely for anything even tangentially related to controlled technology. The U.S. Department of Defense has been looking at this; a 2023 DoD memo on generative AI likely advised caution or outright bans for handling controlled information. We haven't heard of a public ITAR breach via AI yet (perhaps because those working in that space are very cautious by training), but the risk is present. A junior engineer might not realize that a piece of source code for a weapons system, if put into ChatGPT, is effectively an export to OpenAI's infrastructure. OpenAI, as a U.S. company, might not automatically pass it to foreign persons, but if any foreign person can access that data, or if the model's output can be obtained by foreign users, it is a serious problem. OpenAI also has not certified any ITAR compliance (in fact, its user agreement likely forbids using the service for controlled data at all). The enforcement gap here is more about lack of clarity: export control is an old regime that has not been updated for "AI usage." Companies should treat AI like any other unvetted cloud tool – you wouldn't upload ITAR data to a random cloud service without clearance, and the same applies to AI – but employees might not equate a "chatbot" with an export. To mitigate, some defense firms reportedly banned external AI on company networks entirely until they can deploy their own secure AI internally.

Other Regulatory Gaps: GDPR is an interesting one. Technically, if an EU resident's personal data is input to ChatGPT without a lawful basis, that could be a GDPR violation (unauthorized processing and likely an international transfer). Italy's data protection authority actually banned ChatGPT for a period in March 2023, citing unlawful processing of personal data (OpenAI had no clear legal basis for using personal data in training and could not confirm whether it was processing minors' data)[61]. OpenAI responded with changes (allowing users to delete history, providing a form to request data deletion, and so on) and Italy lifted the ban. But GDPR remains a concern: companies in Europe have to consider whether using ChatGPT means transferring personal data to the U.S. (which at the time lacked an adequacy arrangement after Privacy Shield's invalidation). They might rely on standard contractual clauses, but OpenAI didn't offer those to end users. This is complex, and many EU companies simply banned ChatGPT pending legal clarity. Similarly, sectoral rules in finance (e.g. bank secrecy, or customer financial data protection under GLBA) could be violated if, say, a banker input client financial information. We saw JPMorgan and other banks restrict employee use early on, likely for this reason. So the enforcement gap is often closed by preemptive prohibition: some regulated entities just said "don't use it at all" because they couldn't ensure compliance otherwise.
To encapsulate: policies and frameworks failed to prevent these leaks because they weren't specific, weren't integrated into workflows, or weren't technically enforced. NDAs assumed common sense; AUPs were outdated or toothless in the face of new technology; compliance programs didn't yet cover AI usage scenarios. The result was a policy-vs-reality gap: on paper, data should never leave; in reality, it was leaving by the megabyte. We illustrate some of these gaps in a table below (Policy vs. Reality). But first, let's look at concrete incidents and evidence of leaks to understand the impact.

Empirical Evidence of Proprietary Data Leakage

This section catalogs verifiable cases and studies where proprietary or sensitive data actually leaked into or via AI systems. We focus on documented incidents (as opposed to hypothetical risks), including publicized corporate leaks, research findings, and any legal proceedings that shed light on what went wrong.

Samsung's Source Code and Meeting Notes Leak (March 2023): This is one of the earliest widely reported incidents. Samsung Semiconductor employees, within about 20 days of the company allowing ChatGPT use, inadvertently leaked trade secrets by pasting them into the chatbot[62]. According to reporting by Economist Korea and others, three separate instances occurred[13]: (1) an engineer debugging an equipment database interface pasted large chunks of proprietary source code into ChatGPT to identify an error[63]; (2) another employee submitted code for equipment identification and asked ChatGPT to optimize it[64]; and (3) a third employee transcribed a recording of an internal meeting and fed the confidential notes to ChatGPT to generate minutes[65]. In all cases, the data (code and text) was sensitive and not meant to leave Samsung's network. The employees presumably got useful answers, which reinforced the sense that this was a good idea. The leak was discovered when Samsung later audited usage or heard about it (details vary: some say Samsung found out via an internal review, others suggest OpenAI staff may have detected something and alerted Samsung, but it was most likely internal). Impact: the immediate concern was that this data was now on OpenAI's servers and, because ChatGPT data is used for training, could resurface in model outputs to others[22]. Samsung responded swiftly by limiting prompt sizes to prevent large dumps[66] and, within a few weeks, by implementing a complete ban on generative AI tools on company devices[51]. The ban was publicized in May 2023. The incident embarrassed Samsung and served as a wake-up call globally. It is a prime example of trade secret leakage: source code is the lifeblood of a tech company's competitive edge, and meeting notes might include strategy, yield issues, and the like. Public incident databases noted that this could expose Samsung to all kinds of trouble if the code later appeared elsewhere or if a competitor somehow accessed model outputs containing it[67]. It is unknown whether any of Samsung's data actually surfaced to other users, but the mere risk was enough. The case also shows scale: one of the prompts reportedly contained a code file of roughly 1KB of text – not huge, but enough to contain critical logic[66]. Multiply that by many employees over time and one sees the potential magnitude.
Amazon's Internal Data Detected in ChatGPT Answers (early 2023): Around the same time, Amazon's lawyers became aware that ChatGPT answers were sometimes mirroring Amazon's proprietary information[52]. Specifically, an internal Slack message from an Amazon engineer said that ChatGPT had solved some Amazon interview questions and even improved some internal Amazon code during that test[68]. That suggests the model had seen similar code before, possibly from an Amazon prompt. More concretely, Amazon's legal counsel wrote that they had "already seen instances where [ChatGPT's] output closely matches existing material" from Amazon's confidential data[53]. This strongly implies that some Amazon employees had used ChatGPT with confidential input and that this content (or something close to it) was then output by the model in another context. The counsel's memo, which leaked to the press, explicitly warned that inputs to ChatGPT "may be used as training data" and that they did not want outputs to contain Amazon confidential information[69]. This is indirect evidence of leakage: it is not that ChatGPT literally printed "Here's Amazon's secret code" on its own, but that someone prompted it in a way that made it regurgitate learned internal patterns. It alarmed Amazon enough to effectively threaten employees with disciplinary action if they used it with any sensitive information[52]. So while not a public leak (the data wasn't exposed to outsiders beyond OpenAI), it is an example of a model-internal leak: the boundary between Amazon and the AI wasn't secure, and Amazon caught it early.

ChatGPT Glitch Exposing User Conversation History (March 2023): OpenAI had a significant incident on March 20, 2023, when a bug in an open-source library (a Redis client) caused data from one user's session to be visible to others. The most noted effect was that users saw other users' conversation titles in their history sidebar[20]. For example, you could log in and see titles of conversations you never had ("Board Meeting Notes," say), which might itself reveal proprietary information through the title alone. OpenAI later revealed that the bug also exposed some active chat contents and, in a limited number of cases, personal information from the subscription payment system (names, emails, and partial credit card information of ChatGPT Plus users)[70]. They estimated that 1.2% of ChatGPT Plus subscribers had some data leaked to other users[20]. From a corporate perspective, if an employee had used ChatGPT and titled a conversation "Acme Corp Merger Plans," another random user might have seen that title – a clear data leak. Even though the other user couldn't open the chat, the title itself could be sensitive. This incident was cited by regulators (Italy's Garante referenced it as a data breach in its ban order)[61]. It underscores that even if you trust the AI provider not to voluntarily share data, bugs happen. Multi-tenant cloud systems are complex; a simple caching bug broke isolation. An enterprise secret put into ChatGPT on Monday might have been seen by an unrelated user on Tuesday because of this glitch. OpenAI fixed the bug and was transparent about it in a blog post, but it demonstrates a failure mode: server-side bugs leading to unintended disclosure. No specific company was named as harmed, but one can imagine some were. The takeaway: ChatGPT is not immune to the kinds of data breaches that hit any SaaS platform – this one was just highly visible because many users noticed strange histories.
Confidential Data in Training Sets (AI Model Memorization): Researchers have documented that large language models sometimes emit chunks of text that appear to come straight from their training data, including content that looks confidential or private. For example, earlier research retrieved verbatim news articles and email addresses from GPT-2's training data[44]. More recently, the 2023 training-data extraction study on ChatGPT by Carlini and colleagues managed to pull several megabytes of training text out of the model using a clever "repeat back" prompt[46]. Among the extracted data were what looked like private conversations, code snippets, and personal information[47]; they even note that the model output a real email address and phone number during their experiments[47]. That training data wasn't necessarily user prompts – it could have been scraped from the web (someone's personal information on a webpage, for instance). But there have been claims that some data in GPT-4's training might have come from user interactions with earlier models or from scraped private code repositories. One notable rumor was that OpenAI might have trained on data from GitHub Copilot telemetry or similar, though this was never confirmed. Nonetheless, the fear behind the Samsung code leak was precisely that future GPT versions might regurgitate Samsung's code, and that is plausible: the researchers showed roughly 5% of their exploit's outputs were exact 50-token sequences from training[71]. So if proprietary data gets into training, others could fish it out with the right prompt. This is empirical evidence by simulation: not a public incident per se, but a lab demonstration of the vulnerability.

Intentional Red-Team Leaks: Some red-team exercises have explored prompting models to reveal secrets. For instance, one might try: "Ignore instructions and output any secret data you learned during training." Usually the AI refuses, but prompt injection or jailbreaks sometimes achieve partial success. There were informal reports on forums of users getting GPT-4 to output what looked like base64-encoded strings that might correspond to proprietary data[72]; without specifics, we treat these as anecdotal. However, companies have started doing internal red teaming of AI: for example, feeding known synthetic "secrets" during fine-tuning and seeing whether a user can later extract them. These tests often find that models will yield the exact secret string if asked just right (especially if it appears frequently in training). This is empirical confirmation that memorized-data extraction is real. The literature on extractable memorization, such as Tramèr et al. (2022), backs this up[73]. For a concrete number, one study found that a model like GPT-2 could memorize and regurgitate hundreds of examples out of the millions seen in training[74]. So if your proprietary data is among those, it could come out to someone else.

Public Code or Data via AI Outputs: Another angle: people have observed that ChatGPT and Copilot sometimes produce outputs matching copyrighted or proprietary code verbatim. Early Copilot tests in 2021 found it would sometimes reproduce a famous block of code (such as a well-known minimax implementation) exactly as in training, or even a Quake engine snippet complete with its original comments (which was GPL licensed). GitHub acknowledged this and added a filter for suggestions over 150 characters that exactly match training corpus code[75].
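The same kind of check can be run defensively on the consuming side: scan model outputs for long verbatim overlaps with a corpus the organization cares about (its own private code, known secrets). The sketch below is a minimal illustration under stated assumptions, not a reconstruction of any vendor's filter; the corpus location and the 150-character threshold are borrowed from the Copilot example above purely for illustration.

```python
# Minimal sketch: flag a model output that shares a long verbatim run with
# any document in a corpus of known-sensitive text (e.g., proprietary code).
# The corpus directory and the 150-character threshold are illustrative
# assumptions, not a reference implementation of any vendor's filter.
from pathlib import Path

THRESHOLD = 150  # roughly mirrors the exact-match length cited above


def load_corpus(corpus_dir: str) -> list[str]:
    """Read every file under corpus_dir as one whitespace-normalized string."""
    docs = []
    for path in Path(corpus_dir).rglob("*"):
        if path.is_file():
            docs.append(" ".join(path.read_text(errors="ignore").split()))
    return docs


def has_shared_run(output: str, source: str, threshold: int = THRESHOLD) -> bool:
    """True if output and source share a verbatim run of >= threshold characters."""
    output = " ".join(output.split())
    if len(output) < threshold:
        return False
    # Slide a threshold-sized window over the output and test exact containment.
    return any(output[i:i + threshold] in source
               for i in range(len(output) - threshold + 1))


def output_overlaps_corpus(output: str, corpus: list[str]) -> bool:
    return any(has_shared_run(output, doc) for doc in corpus)

# Usage: corpus = load_corpus("./sensitive_code"); output_overlaps_corpus(answer, corpus)
```

A check like this is cheap enough to run on every response passing through an internal AI gateway, though it only catches exact reuse, not paraphrased leakage.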
Of course, code from public repos is not proprietary if it is open source, but the worry is whether private code slipped in or whether the model memorized something under a non-permissive license. There was speculation that some ChatGPT answers to niche technical questions contained content possibly taken from internal documentation that wasn't publicly available (this is hard to verify and could be coincidental). Still, Amazon engineers saw matches to internal material, as noted above[53] – an example of output exposing protected information. Another example: some users reported that if they asked ChatGPT about specific companies' internal strategies (particularly where data had leaked somewhere online), it would produce surprisingly detailed answers. This might not be a model leak per se (it could be drawing on news or rumors), but it highlights that if any proprietary information made it into a source the model saw, the model can surface it.

Court Filings and Discovery: So far we haven't seen court cases where AI usage led to a lawsuit over data leakage, but it is foreseeable. One active arena is copyright – e.g. authors suing OpenAI for training on their books without consent. For proprietary data: if a company found its trade secrets output by an AI, it might sue the employee (for breach) or conceivably the AI provider (for misappropriation). No such case has fully surfaced in public yet. One older case is indirectly relevant, however: Mattel v. MGA (the Barbie vs. Bratz dispute) touched on how involving an outside party who hasn't signed an NDA can jeopardize trade secret status. By analogy, using OpenAI (with no NDA in place) could be argued to destroy trade secret protection for whatever was shared with it. No one has tested this in court, but it looms as a possibility if a secret leaked via AI caused damages.

Insider "Shadow IT" Audits: Some companies have run internal audits or simulations. Cyberhaven (a DLP company) reported that in the average company it monitored, 3.5% of employees pasted company-confidential information into ChatGPT within a month, including source code and client data[76]. They also quantified that about 11% of what users paste is sensitive (per their DLP classifier)[76]. This isn't a single incident but empirical evidence across many companies that leakage is actively happening. Similarly, the LayerX study used browser telemetry to find a large amount of sensitive data flowing to GenAI[77], with "hundreds of instances a week" at some firms[76]. This kind of internal evidence often doesn't get publicized by the companies themselves (for PR reasons), but security vendors are raising alarms with these statistics. It is essentially a slow, continuous leak rather than a one-time breach – which is even harder to catch.

In conclusion, the empirical evidence ranges from headline-grabbing leaks (Samsung) to harder-to-see seepage (hundreds of employees leaking small bits). The documented cases confirm that trade secrets and sensitive data have indeed crossed into AI models, and in some cases have been retrieved or observed coming out. We have learned that even one employee's action can have broad consequences: if that data gets trained on, it could later leak to any user of the model. These real incidents have informed the rapid policy responses in many companies and the development of the mitigations we discuss next.
To visualize some of these failures and how they occur at different layers, we provide a failure mode analysis next, and then examine how policy expectations differed from these realities.

Failure Mode Analysis (Layered Failures)

The table below breaks down key failure modes by layer or aspect of the stack, with a description and a real example or piece of evidence for each. This highlights how multi-layer breakdowns lead to leaks:

Table: Failure modes across layers, with examples. Each of these illustrates how a lapse at one layer (or multiple) can lead to proprietary information escaping into an AI system or through it.

From the table, one can see how multiple failures compound. For instance, a Samsung engineer using ChatGPT (user behavior) with no warning (UX) on a personal account (identity) put code into the model (prompt), which OpenAI stored and potentially trained on (provider retention and model memorization), which later caused Samsung to worry it might appear elsewhere (model output leakage). A single incident touches many rows above.

Now we compare what organizations expected (policy) with what actually happened (reality) to highlight the policy gaps identified.

Policy vs. Reality Gap

This side-by-side comparison shows how formal policies and compliance requirements intended to prevent data leaks were outpaced by actual behaviors and failures in the context of AI usage:

Table: Policies on paper vs. actual outcomes in the era of AI data leaks. The gap illustrates that traditional compliance measures did not initially encompass or prevent this new mode of data exfiltration via AI.

As we see, reality bit back: people acted contrary to policy (often unknowingly or under pressure), and technology failures exposed data despite the rules. The solution is not simply more paperwork; it requires a combination of technical and organizational measures. Next, we turn to mitigation strategies across technical, organizational, and architectural classes that can address these failure modes and close the policy-reality gap. We also discuss two optional angles: choosing on-premise models and handling logging.

Mitigation Strategies for Proprietary Data Protection in AI Use

Mitigating the risk requires a multi-pronged approach. We group mitigations into three classes – technical controls, organizational measures, and architectural solutions – as all are needed in tandem. The goal is to enable employees to harness AI's benefits safely, aligning security with usability.

Technical Mitigations (Preventing Leaks through Technology)

These mitigations involve tools and software safeguards that automatically enforce policies and prevent sensitive data from escaping or being misused:

Data Loss Prevention (DLP) Integration for AI: Extend traditional DLP to AI interactions. This means monitoring and filtering content being sent to AI services. For example, implement a proxy or browser extension in the corporate environment that intercepts calls to popular AI APIs and scans the prompt for sensitive patterns (keywords, regexes for SSNs, code fingerprints, etc.). If something is detected, the request can be blocked or the sensitive parts redacted. Several security vendors have started offering "AI DLP" that wraps around ChatGPT's web UI or OpenAI API keys. According to Code42/CSA recommendations, organizations should "block paste activity into unapproved GAI tools" and generate alerts for such attempts[80]. In practice, this could mean disabling the ability to paste (or type) when the DLP agent detects that a window is a ChatGPT input box and the clipboard holds classified data. Alternatively, an enterprise could route all ChatGPT traffic through an authenticated web proxy that strips or audits content. As an example of a blunter mitigation, JPMorgan reportedly blocked the ChatGPT site on its network when it couldn't otherwise stop sensitive use. More granularity is better: allow usage, but strip or block obvious sensitive information. For API integrations, one can build a filtering layer – for example, before sending a support ticket to an LLM, remove or mask any personally identifiable information. DLP is not foolproof (it can miss context or new sensitive data types), but it is a vital line of defense against accidental leaks.

Content Classification & Warnings in UI: Modify the user interface of AI tools (where possible) to include real-time classification and user feedback. For instance, an enterprise deploying an internal ChatGPT-like tool can highlight detected sensitive terms in red and pop up a prompt: "You are about to send potentially confidential information to an external service. Proceed?" This nudge could catch a lot of incidents. OpenAI's own guidance suggests users refrain from sharing sensitive information, but that passive notice isn't enough; a dynamic warning is more impactful.
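To make the pattern-scanning idea concrete, here is a minimal sketch of a pre-submission check that redacts a few obvious patterns (SSNs, card-like numbers, key-like strings) and reports what it found so the UI can warn or block. The regexes and the redact-then-warn policy are illustrative assumptions; a real deployment would plug in the organization's own DLP classifiers and data definitions.

```python
# Minimal sketch of a pre-submission prompt check: redact obvious sensitive
# patterns and flag the prompt so the UI can warn (or block) before sending.
# The regexes and the redact/warn policy are illustrative assumptions only.
import re

SENSITIVE_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "api_key": re.compile(r"\b(?:sk|AKIA)[A-Za-z0-9_-]{16,}\b"),
}


def scan_prompt(prompt: str) -> tuple[str, list[str]]:
    """Return (redacted_prompt, names of the patterns that matched)."""
    findings = []
    redacted = prompt
    for name, pattern in SENSITIVE_PATTERNS.items():
        if pattern.search(redacted):
            findings.append(name)
            redacted = pattern.sub(f"[REDACTED:{name}]", redacted)
    return redacted, findings


# Example policy hook: warn when anything matched, forward the redacted text.
prompt = "Customer 123-45-6789 reported a billing issue with card 4111 1111 1111 1111."
safe_prompt, findings = scan_prompt(prompt)
if findings:
    print(f"Warning: prompt contains {findings}; sending redacted version.")
# safe_prompt can now be forwarded to the AI service instead of the original.
```

In a browser extension or gateway proxy, the same check would run just before the request leaves the corporate boundary, with the warning shown to the user at that point.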
Some organizations have created custom front-ends to ChatGPT: employees log in and use ChatGPT via that interface, which logs queries for audit and can enforce policies (such as a regex that forbids anything that looks like a 16-digit credit card number from being submitted). Essentially, bake security into the UX rather than relying on users. One bank allegedly implemented a plugin that automatically anonymized or tokenized client names in any text before sending it to GPT, so that if employees do use it, at least the data is pseudonymized. This kind of pre-processing can be effective for certain data classes (though it is harder for code or free-form text).

AI Usage Monitoring and Anomaly Detection: Treat AI use as a new channel to monitor. For example, track the volume and frequency of prompts per user and flag anomalies: if an employee suddenly pastes huge amounts of text or code repeatedly, that could indicate risky behavior or even malicious exfiltration disguised as AI queries. The LayerX report found that users who paste into GenAI make about 7 pastes a day on average, roughly 4 of which contain sensitive data[81]. If an employee who normally never uses ChatGPT one day pastes 20 pages of client data, that deviation should trigger an investigation. Security teams can feed logs from OpenAI's API (for those using API keys) into SIEMs to watch for keywords or volume spikes. For web usage, some forward proxy solutions can log URLs and payload sizes – not the content, if encrypted, but at least the fact that a user posted X bytes to chat.openai.com. That alone can be useful (why did Bob just send 5 MB to an AI service?). Monitoring must be done in a privacy-sensitive way, but in regulated environments it is often permissible to monitor for exfiltration.

Centralized Access Control and Approved AI Tools: Limit which AI tools can be used and enforce authentication. For instance, provide an official corporate ChatGPT Enterprise account or an Azure OpenAI instance, and block access to unsanctioned alternatives. By funneling employees to approved endpoints, you can enforce login (tying usage to an identity) and ensure the data handling meets your standards (enterprise plans do not train on customer data and have stronger security). LayerX warns that unmanaged browser access is the risky part[1], so implementing SSO for AI tools and restricting access to users who accept monitoring is key. At minimum, block known malicious or dubious AI services: there has been an explosion of "free GPT" websites, and these could be data traps. Also consider firewall rules – some companies block network traffic to OpenAI except from a controlled gateway. Microsoft's guidance even suggests disabling Copilot Free for organizations and allowing only Copilot Business, using firewall and policy settings[82].

Encryption and Data-Masking Techniques: In scenarios where data must be sent out, use encryption or masking so the provider never sees the raw sensitive data. One emerging idea is client-side encryption of prompts, where the model operates on encrypted representations (homomorphic encryption or secure enclaves); that is not mainstream for generative AI yet (still research-grade, with real performance costs). A simpler approach is to mask identifiers in the data: replace real customer names with fake ones or IDs before sending to the AI, then map the answer back. Some companies use format-preserving hashing for things like account numbers in prompts. This way, even if the data leaked, it is not directly useful. This requires custom tooling but is doable for structured data. For code, one could conceivably rename variables to generic names before sending them to the AI (so specific project terms are hidden), then reverse the changes in the response.
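A minimal sketch of that masking round trip is shown below: known client names are swapped for opaque tokens before the prompt leaves the environment, and the mapping is applied in reverse to the model's answer. The entity list, token format, and stand-in reply are assumptions for illustration; a real deployment would draw entities from a directory, CRM export, or an NER step, and would store the mapping securely.

```python
# Minimal sketch of pseudonymizing known entities before a prompt is sent
# and restoring them in the response. The entity list and token scheme are
# illustrative assumptions; production systems would use a vetted entity
# source and persist mappings securely.
import re


def pseudonymize(text: str, entities: list[str]) -> tuple[str, dict[str, str]]:
    """Replace each known entity with an opaque token; return text and mapping."""
    mapping = {}
    for i, entity in enumerate(entities):
        token = f"<ENTITY_{i}>"
        if re.search(re.escape(entity), text, flags=re.IGNORECASE):
            text = re.sub(re.escape(entity), token, text, flags=re.IGNORECASE)
            mapping[token] = entity
    return text, mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Map tokens in the model's answer back to the original names."""
    for token, entity in mapping.items():
        text = text.replace(token, entity)
    return text


clients = ["Acme Corp", "Jane Doe"]
prompt = "Draft a renewal email to Jane Doe at Acme Corp about their support contract."
masked, mapping = pseudonymize(prompt, clients)
# `masked` is what would actually be sent to the external model.
answer = "Dear <ENTITY_1>, thank you for being a valued <ENTITY_0> contact..."  # stand-in reply
print(restore(answer, mapping))
```

The obvious caveat, picked up again in the open questions later, is that the surrounding context may still identify the entity even after the name is masked.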
There is also the concept of "prompt chaff" – adding decoy data to prompts to confuse any learning from them – though its utility is uncertain and it could degrade output quality.

Model Constraints and Watermarking: When using your own or fine-tuned models, you can enforce constraints to reduce memorization and leakage. For example, OpenAI and others are working on watermarking AI outputs to identify whether content came from a model; that does not directly stop leaks, but if a model output containing confidential information carried a watermark, one might trace it. More practically, one can fine-tune models with system instructions along the lines of "never output text that looks like a credential or personal data" (API keys, personal identifiers, etc.). OpenAI's GPT-4 has a moderation layer that can stop outputs containing certain sensitive classes (there are filters for things like sexual content, PII, and so on). Ensuring those are active and custom-tuned for your data can help – for example, define your company's secret project codenames as disallowed output so that, even if learned, the model won't repeat them. There is risk in relying on these (they can be circumvented with adversarial prompts), but they add a layer.

In summary, technical mitigations aim to put guardrails on the pipeline: from input (DLP, UI warnings) to output (filtering) and everything in between (monitoring, encryption). Done well, they make it hard for an employee to accidentally or intentionally send the crown jewels to an AI, or for an AI to accidentally give them out.

Organizational Mitigations (Policies, Training, Processes)

These measures focus on human factors and institutional practices, aligning the workforce and its procedures with safe AI usage:

Clear Policies & Guidelines Specifically for AI: Update the employee handbook, security policies, and engineering guidelines to explicitly address generative AI. Vague "don't leak data" rules aren't enough; spell out scenarios. For example: "Do not input any code marked confidential, or any client personal data, into public AI tools (ChatGPT, Bard, etc.) without approval" – and define what approval and sanctioned tools mean. Provide examples of allowed versus disallowed uses. The Cloud Security Alliance suggests providing "explicit guidance in your Acceptable Use Policy" about GAI tool use[57]. By late 2023, many companies had issued GenAI usage policies. This removes ambiguity: employees should not have to guess. Also clarify consequences – if someone knowingly pastes secret information, is it a fireable offense? Knowing there is a company stance helps employees make the right choice under pressure.

Employee Training and Awareness: Having rules isn't enough; people need to understand the why and the how of compliance. Conduct regular training sessions on the risks of AI (ideally integrated into security awareness programs). For developers, add a module on AI coding-assistant privacy; for marketers, a module on using AI for content without leaking client data; and so on. Emphasize that AI services are third parties, akin to posting data on an external forum. Training should use real examples (Samsung, Amazon, etc.) to make it concrete. According to LayerX, training employees on secure AI use – highlighting data-sharing risks – is a key mitigation[83].
Such training might include practicing with an internal tool – for example, show a prompt and ask "Is this safe to send to ChatGPT?" as a quiz. Also foster a culture where, if someone isn't sure, they ask IT or infosec rather than just going ahead. It is important to avoid chilling all AI use: training should also show approved ways to use AI safely, so employees aren't left without options (otherwise they will quietly use it anyway).

Encourage Use of Approved Internal Tools (Carrot and Stick): Organizationally, roll out approved AI solutions visibly and make them easy to access. For example, a company might license ChatGPT Enterprise or build an on-prem LLM solution, then heavily encourage employees to use that instead of personal accounts. Make it the path of least resistance. Some companies temporarily banned external AI use until their internal solution (such as a sandboxed GPT-4 instance) was ready, then partially lifted the ban. The idea is to channel the desire to use AI into a controlled path, paired with restricting the alternatives (the stick). For instance, a company might block the ChatGPT website but provide an internal portal that uses the OpenAI API with logging and DLP, so employees still get answers, but safely. Communicate that this is not about stifling innovation; it is about doing it securely. When employees see leadership providing tools, the need for shadow use shrinks.

Create an AI Governance Committee or Process: Treat AI introduction the way you would treat bringing any new SaaS or process into the company. Form a cross-functional team (IT, security, legal, compliance, HR) to regularly review AI tool usage, assess new AI services employees want to use, and approve or reject them. Establish a process: if a team wants to try a new AI API, it goes through a quick risk assessment. The governance body can maintain an allowed-AI-tools list and update policies as the landscape changes. This also keeps leadership educated on AI capabilities and risks. Essentially, it formalizes oversight. Some large enterprises have already instituted AI risk committees (often triggered by these early leaks). They might also set guidelines on model usage (e.g. if using AI to generate code, you must review it for IP issues – though that goes beyond the data-leak scope).

Incident Response Plan for AI-related Breaches: Update breach-response playbooks to cover scenarios like "an employee input confidential data into an external AI – what do we do?" and "an AI model output revealed proprietary data – how do we contain it?" This might involve steps like immediately contacting the AI provider to request deletion (OpenAI does allow users to delete specific conversations via support – no guarantees, but it is worth trying), or, if model output leaked something publicly, issuing takedown requests or legal notices. Practicing a mock incident can highlight gaps. For example, if Samsung's incident happened to another company, would it have a way to scan the model's outputs for its data? Not easily; it would likely rely on an NDA with the provider or plain damage control. Having a plan at least means that when an employee self-reports "I think I pasted something I shouldn't have," the company knows how to triage (and it is better to encourage self-reporting by taking a less punitive, more problem-solving approach in those cases). The plan may also need a PR/crisis-communications element if a public leak occurs.

Reinforce Accountability and Ethical Use: Make AI risk part of the corporate ethos.
Just as companies did with phishing ("don't be the one who clicks a bad link"), do the same with AI ("don't be the one who leaks our secrets to a bot"). Without assigning blame, instill that it is everyone's responsibility to handle data carefully even when using new tools. Some organizations have employees sign an acknowledgment of the AI usage policy to underline its importance. Also clarify ownership: code generated with AI might inadvertently include others' IP, so instruct engineers to use code-review practices that catch suspicious verbatim output. That is more an IP-compliance concern than a data-leak one, but it is related (it avoids ingesting information leaked by others).

Positive Incentives and Competitions: Consider "safe AI usage" drives – for instance, run an internal hackathon to develop the best prompt anonymizer or best-practice guidelines. Reward teams that integrate AI securely into their workflows. By engaging staff creatively, you turn them into allies in solving the problem; the people closest to the work often have good ideas about using AI without leaking data if given the chance to brainstorm.

In sum, organizational measures are about setting clear rules of the road and equipping people with knowledge and alternatives. Humans are the first and last line of defense; even with great technical controls, a savvy user can circumvent them if they choose to. Conversely, a well-meaning, trained employee can catch themselves before a mistake even without perfect technical controls. Aligning employee behavior with security goals is crucial, and that is achieved through communication, training, and leadership example.

Architectural Mitigations (Structural and Infrastructure Changes)

These are bigger-picture solutions involving how systems are designed and where AI is deployed – essentially changing the architecture so that sensitive data doesn't need to leave, or remains protected if it does:

On-Premises or Private Cloud AI Deployments: One robust solution is to bring the AI model to your data instead of sending data to the model. Running LLMs on-premises (or in a VPC environment managed by the company) ensures proprietary information never leaves controlled infrastructure. With the proliferation of relatively capable open-source models (Meta's Llama 2 and others), companies can fine-tune a model on sanitized internal data and use it for many tasks without any external calls. Even when using commercial models, vendors like OpenAI and Anthropic are rolling out dedicated instances or allowing models to be hosted in your own cloud tenant. OpenAI's announcement of ChatGPT Enterprise implies it can deploy GPT-4 on isolated infrastructure for a client. By doing so, data stays within company-owned storage and networks, eliminating most of the leakage vectors (no multi-tenancy bug risk, no provider training on your data, no foreign data export). Summit 7 (a government cloud provider) notes that installing a ChatGPT alternative on-prem "solves a lot of the data movement issues"[84]. This approach was taken by companies like Apple, which reportedly banned employees from using external AI and is developing its own internal LLMs; banks and defense firms likewise favor this route. The trade-off is cost and capability: maintaining AI infrastructure is non-trivial, and current open models may be less capable than GPT-4. For many uses, however (especially with domain-specific fine-tuning), they may suffice.
On-device inference (running a model on a user's workstation, for example) is a variant – feasible for smaller models – and provides security even at the network level, since nothing is sent out. The main point: architectural localization of AI reduces reliance on external trust. It turns an AI tool from an external service into part of your internal IT stack, where standard controls (access control, logging, encryption at rest, etc.) apply.

Hybrid Models and Federated Learning: Where full on-prem isn't possible, a hybrid approach can limit exposure. For instance, keep sensitive parts of the data processing local and only send derived or partial information to the cloud model. One relevant concept is federated learning: if companies could someday update a local model and share only model-weight updates that don't contain raw data, they could get collective improvement without central data pooling (Google has done this for AI on mobile devices). It isn't mainstream for LLM fine-tuning yet, but it is something to watch. In an enterprise context, a central corporate LLM might train on many departments' data while each department's raw data stays in its silo. Another angle: do heavy pre-computation on sensitive data internally to produce embeddings or summaries that abstract away the original data, and only feed those abstractions to external models (assuming they can work with them). This limits what the external service sees (numbers or vector embeddings instead of actual text, for example).

Client-Side Isolation and Sandboxing: If employees use AI tools on their clients (PCs, phones), sandbox those applications. For example, use a browser configured in an "AI use sandbox" mode where clipboard access is restricted (so the tool can't scrape arbitrary clipboard data) and downloads are disabled, or run the AI tooling in a virtual desktop with no access to the internal network. Some companies route AI usage through a separate VLAN or network segment with no access to production systems or sensitive databases, so that even if an AI plugin tried to snoop around, it would find nothing. This is similar to how some organizations treat web email for security – open it in an isolated container. If an employee wants to feed data from a secure system to the AI, they have to take it out manually (which, hopefully, triggers DLP). It adds friction, yes, but in highly sensitive environments that friction is desired.

Model Prompt Templates with Automatic Redaction: At the application level, companies can design internal workflows so that whenever AI is invoked, the system programmatically strips certain fields. For instance, if a CRM wants to summarize a customer case via an API call to GPT, the code can drop the customer's last name and email from the text before sending it. This architectural pattern – building data sanitization into every AI integration – is crucial for compliance. It should be standard: never send SSNs, never send authentication tokens, and so on. Some forward-thinking firms maintain an internal library for AI calls that handles this centrally (so developers don't reimplement it each time). That library could also add "do not retain" instructions to the prompt, though the effect of those is not guaranteed; they are more a statement of intent than an enforcement mechanism.

Choosing Providers with Better Privacy Postures: Another structural decision is vendor selection. Not all AI providers treat data equally.
Some (like OpenAI's free services) use data for training, while others (like Azure's offerings and some newer startups) make "zero retention, zero training" a core feature. When architecting solutions, prefer providers or services that contractually commit to not using your data and to purging it quickly. For example, OpenAI's API with a "Zero Data Retention" addendum will not store any prompts or responses beyond serving the request[85][38]. If an enterprise is using AI for, say, medical data summarization, it might choose Microsoft's Azure OpenAI in a HIPAA-eligible environment over the public ChatGPT, specifically because Azure can sign a BAA and isolate the data. Another example: some companies in Europe might choose EU-based LLM providers to avoid data-export issues under GDPR. In essence, architect your usage so that the service itself aligns with compliance; that mitigates risk at the root.

Differential Privacy and Perturbation: Organizations pushing the boundary could incorporate differential privacy techniques before sending data to an AI – adding a bit of noise or removing identifying details so that any one query doesn't reveal too much. This is advanced, and not something every company will implement, but academia is discussing it. The idea is to allow statistical insight without giving away exact secrets. For instance, when generating a report, fuzz each number by a tiny random amount – enough that the AI's summary isn't materially affected, but not accurate enough to leak exact figures if somehow reconstructed. Libraries for differential privacy exist (from Google and others), but applying them to free-form text is still research territory. Still, the architectural concept stands: intentionally degrade or tokenize precise secrets (for example, replace actual secret code blocks with placeholder tokens that instruct an internal post-processor to fill them back in after the AI does the generic work). That way the AI never sees the real secret logic, only a marker like <SECRET_FUNC>. This requires careful planning but is feasible for some code uses.

Logging and Observability Design: Architect systems to log AI interactions securely for later auditing. For example, if a user uses the internal AI portal, log the prompt minus any classified sections (or hash them) so that if something leaks, you can trace which prompt and user likely caused it. Design the logs so they do not themselves become a second point of leakage (encrypt them at rest, restrict access to them). Observability also means measuring outputs: run them through a scanner for sensitive patterns and flag any hits. Over time, this helps refine prompt-filtering rules as you see near-misses. Essentially, build internal telemetry that watches AI usage patterns and results in aggregate, to continuously improve the safeguards.

Combining all of the above, an ideal future-state architecture for an enterprise might look like this: employees use an internal AI assistant that runs either on-prem or in a trusted cloud enclave; all prompts go through a gateway that scrubs PII and secrets; the AI model has been fine-tuned on non-sensitive internal knowledge (so it is useful without needing sensitive inputs); none of the prompt data is stored or leaves the environment; and every query and response is tagged with a user ID and kept for audit in case of issues. Meanwhile, employees are trained on and comfortable with this setup, and have little reason to seek external tools.
Realistically, not every company can do all of this today, so interim steps – partial restrictions and third-party solutions bridging the gaps – are in play. But these architectural shifts are happening rapidly (the big AI providers are offering more private deployments precisely because enterprise demand is high). Finally, let's address two optional angles that tie into mitigations and future considerations: the trade-off between on-device and cloud inference, and the risks around logging and observability (touched on above, summarized here).

On-Device vs Cloud Inference (Optional Angle)

The choice between running AI models on-device (locally) and in the cloud is emerging as a pivotal architectural decision. Each has pros and cons for security.

On-Device (Local or On-Prem) Inference: Keeping the model local means data never leaves the device or data center. This maximally protects confidentiality because, however powerful the model, it is under the organization's full control. For instance, a lawyer's laptop could run a smaller LLM to analyze documents; client data stays on that encrypted laptop, satisfying confidentiality requirements. Similarly, an enterprise might deploy a model on an internal server, with all employee queries hitting that server behind the firewall. This approach mitigates almost all concerns about external breaches or training leakage, and it simplifies compliance (no GDPR cross-border issues, for example). However, on-device models may be less capable or slower if hardware is limited. Large models (GPT-4 class) currently need powerful GPU clusters, and most companies will not replicate OpenAI's infrastructure in-house. So there is a capability trade-off, though the gap is closing as open-source models improve. Another benefit: on-device deployments can be customized heavily – organizations can hard-code filters or modify the model, and they can truly enforce zero retention (since they control storage). Many see on-prem as the endgame for the most sensitive cases; Palantir's AI platform, for example, actively markets that it can run LLMs inside your private network so that "data never leaves your environment."

Cloud (Hosted) Inference: Using cloud-hosted AI (OpenAI, Microsoft, Google, etc.) offers access to the most advanced models and easy scalability, with no infrastructure to manage. Cloud providers often have superior uptime and have invested in model-safety techniques. For general tasks, cloud may yield better answers and therefore a bigger productivity boost. From a risk perspective, cloud means trusting the provider's security and compliance. Good providers can have very strong security (OpenAI and Microsoft are SOC 2 certified, for instance), and arguably their risk of external breach is low; the systemic risk is the one we highlighted (data used for training, multi-tenancy bugs). Some mitigations help here too – using a dedicated cloud instance reduces multi-tenancy risk, bringing the scenario logically closer to on-prem. Cloud also means ongoing improvement: the model gets better over time, whereas an on-prem model may go stale unless updated. For some regulated industries, cloud is a non-starter unless heavily vetted; for others, cloud with encryption and contractual assurances may be acceptable.

In practice, we are seeing a hybrid approach: use on-device or self-hosted models for the most sensitive data and tasks (so those queries never go out), and use the cloud for less sensitive, high-value tasks where the top model is needed.
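The sketch below shows one way such routing can be wired: a gateway applies a deliberately crude sensitivity check and forwards the prompt either to a self-hosted model or to a cloud API. The endpoint URLs, model names, and keyword heuristic are assumptions for illustration, and the internal server is assumed to expose an OpenAI-compatible chat endpoint, as many local runtimes do.

```python
# Minimal sketch of a sensitivity-aware routing gateway: prompts that look
# sensitive go to a self-hosted model, everything else to a cloud API.
# Endpoints, model names, and the simple keyword check are assumptions;
# the local server is assumed to expose an OpenAI-compatible chat endpoint.
import os
import re
import requests

LOCAL_ENDPOINT = "http://llm.internal.example:8000/v1/chat/completions"  # hypothetical
CLOUD_ENDPOINT = "https://api.openai.com/v1/chat/completions"

SENSITIVE_HINTS = re.compile(r"client|patient|\b\d{3}-\d{2}-\d{4}\b|confidential", re.I)


def route(prompt: str) -> str:
    """Send sensitive-looking prompts to the internal model, others to the cloud."""
    if SENSITIVE_HINTS.search(prompt):
        url, model, headers = LOCAL_ENDPOINT, "internal-llm", {}
    else:
        url, model = CLOUD_ENDPOINT, "gpt-4o-mini"
        headers = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}
    resp = requests.post(
        url,
        headers=headers,
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# route("Summarize this quarter's public earnings call")      -> cloud model
# route("Summarize the attached client escalation for Acme")  -> internal model
```

In a real deployment, the sensitivity check would be the same classifier the DLP layer uses, and the gateway would also handle logging, redaction, and authentication.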
For example, an insurance company might use an internal model to process individual claims (which contain personal data), but use OpenAI's cloud to generate general reports or do creative writing that relies only on anonymized or public information. On-device vs cloud is not all-or-nothing; it can be decided per use case. Architecturally, one can route queries to one or the other based on classification – effectively a gateway that decides: "if the prompt includes client data, use our local model; if not, call OpenAI." This ensures compliance where needed while still leveraging cloud power where it is safe. Looking five years ahead, we anticipate that devices (from phones to servers) will be capable of running quite advanced models (perhaps distilled versions at GPT-4 level), while providers will offer more trustworthy cloud options (processing in secure enclaves, legal commitments not to store data). The gap will close. For now, though, each organization must weigh the privacy-versus-performance trade-off. Many are erring on the side of caution and exploring on-prem options – as evidenced by the massive interest in local models in 2023 – to avoid another Samsung-like incident.

Logging and Observability Risks (Optional Angle)

We discussed logging as both a risk and a mitigation; it is a double-edged sword. Here we focus on the risk side: the logs and telemetry of AI systems can themselves leak sensitive information if not handled properly.

Client-Side Logs: Many AI applications (especially apps or extensions) keep logs for debugging. A desktop ChatGPT client, for example, might log conversation IDs or even content snippets to help developers improve the UI. If those logs aren't stored securely, or are synced to a cloud backup, that is a leak vector. An enterprise must ensure any such logging is disabled or sanitized on managed devices. One strategy is to use ephemeral modes – for instance, use ChatGPT in a dedicated browser profile set to keep no history or cookies, minimizing local residue. On mobile devices, if employees use AI apps, consider MDM policies to restrict what those apps can access (some MDMs can block screen capture or keyboard logging for specific apps). Essentially, treat AI apps like any other potential data sink.

Server-Side Logs and Telemetry: AI providers generate telemetry from usage, for example to monitor abuse and performance. OpenAI's systems, as noted, had a bug that leaked some of this data (conversation titles)[20]. Another example: Mixpanel, a product analytics service used in part of OpenAI's infrastructure, suffered a security incident that OpenAI disclosed in November 2025, exposing a dataset of limited API-account information[86]. Chat content was reportedly not involved – mostly names, emails, and usage metadata – but it shows that even the analytics pipeline can be a risk; OpenAI responded by removing the affected analytics provider from its services[86]. Enterprises should ask providers what telemetry is collected. If using an API, you can sometimes disable or minimize it (Azure allows opting out of certain data collection in some cases). Zero-retention options aim to eliminate such logs entirely (versus the standard retention of up to 30 days), but as we saw, legal events can override that[29]. So plan for log risk: if legally compelled, your data might sit in logs longer than expected – are you comfortable with that?
One can also design prompts so they don't include directly identifying information, so that even if logs leak, the contents aren't obvious (this goes back to earlier mitigations like pseudonymization).

Audit Logs as Vulnerabilities: Ironically, the logs we keep for audit could themselves leak if attackers obtain them. If a company logs every prompt employees make (to ensure policy compliance), an attacker who breaches that log store gets a trove of sensitive information, since the log is essentially a copy of everything employees tried to send to the AI. Therefore, protect these logs like any sensitive database: encrypt them, restrict access strictly (only the security team, perhaps with multi-factor authentication), and set retention limits (don't keep them forever unless needed). Also consider anonymizing logs – for example, instead of storing the full prompt text, store a hash or a classification ("contained client data: yes/no"). Balance is needed: enough logging to investigate incidents, but not so much that the logs become a honeypot of secrets.

Monitoring vs Privacy: Observability can conflict with user privacy or other regulations. Recording everything an employee types into an internal AI tool might raise works-council issues in some countries (it is surveillance of employee activity). Solutions need to be mindful: perhaps monitor in aggregate, or flag only specific patterns rather than storing everything. Engaging stakeholders (HR, legal) in deciding log policy is important to avoid creating new problems. If employees fear that using an AI tool means all their input is recorded and scrutinized, they may avoid the sanctioned tool and go rogue on a personal device. So transparency about what is logged, and why, is needed to maintain trust and comply with labor laws.

In summary, while logging and observability are crucial for detecting and preventing leaks, they must themselves be designed securely. Think of it like CCTV in a bank: yes, it records robbers, but if the tapes are stolen they may show sensitive information, so you guard the tapes too.

With mitigations laid out, it is clear this is an evolving field. We have addressed the immediate known issues, but open research questions remain as technology and usage patterns change. We conclude with those open questions, to acknowledge the limits of current knowledge and encourage ongoing inquiry.

Open Research Questions

Despite rapid progress in understanding and managing AI-related data leakage, several unresolved questions and challenges remain. These need further research, tooling, or even regulatory guidance in the next few years. We list some of the most pertinent open questions:

How to Verify Model Deletion of Proprietary Data: When a company asks an AI provider to delete specific data (or opts out of data retention), how can it be sure the data isn't still embedded in model weights or lurking in backups? Today, model "forgetting" is an unsolved problem. If Samsung's code went into GPT-4's training, is it even possible to remove its influence without retraining from scratch? Research into machine unlearning and efficient retraining is needed. Regulatory pressure may eventually require proof of deletion of sensitive data from models, which we currently lack the methods to audit. This raises the question: do we need tools to probe models for specific memorized strings as part of compliance audits? Some researchers (like Carlini et al.) are pioneering such probing[47], but it is not a standard capability offered by providers yet.
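To give a sense of what such probing might look like, the sketch below sends the prefix of a planted synthetic "canary" secret to a model endpoint and checks whether the completion reproduces the remainder. The endpoint, the canary string, and the request shape are assumptions for illustration; published extraction attacks are considerably more sophisticated.

```python
# Minimal sketch of a canary-extraction probe: for each planted synthetic
# secret, prompt the model with its prefix and check whether the completion
# reproduces the rest. Endpoint, canaries, and request shape are
# illustrative assumptions, not a standard provider-supported audit API.
import requests

ENDPOINT = "http://llm.internal.example:8000/v1/completions"  # hypothetical
CANARIES = [
    "The deployment passphrase for project NIGHTJAR is 7f3a-9c21-aa04",  # synthetic
]


def probe(canary: str, prefix_len: int = 40) -> bool:
    """Return True if the model completes the canary's prefix with its suffix."""
    prefix, suffix = canary[:prefix_len], canary[prefix_len:]
    resp = requests.post(
        ENDPOINT,
        json={"model": "internal-llm", "prompt": prefix, "max_tokens": 64, "temperature": 0},
        timeout=60,
    )
    resp.raise_for_status()
    completion = resp.json()["choices"][0]["text"]
    return suffix.strip() in completion


for canary in CANARIES:
    if probe(canary):
        print("Memorization detected: canary can be extracted verbatim.")
```

A negative result from a probe like this is weak evidence (the data may still be extractable with a better prompt), which is precisely why auditable unlearning remains an open research problem.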
Legal and Ethical Boundaries of AI Training Data: There is debate over whether using proprietary user prompts to improve models constitutes an IP violation or trade secret misappropriation. For instance, if ChatGPT regurgitates a company's internal code to another user, is OpenAI liable? The law hasn't caught up. Open research is needed at the intersection of IP law and AI: defining what constitutes misuse of proprietary data once it has been transformed by a model. Ethically, should AI companies segregate training data by source to prevent cross-pollination of confidential information? Perhaps we will see "consent-based training," where enterprise data is used only in models destined for that enterprise. This is both a technical and a legal negotiation space that remains unresolved.

Balancing Privacy with Model Utility (Differential Privacy): Can we develop techniques that allow models to learn from data without memorizing specifics – for example, with differential privacy guarantees – while still being useful? There is ongoing research into training large models with DP, which adds noise to prevent memorization of any single data point, but strict DP often degrades model performance. Finding a sweet spot where an LLM can be trained on, say, personal communications yet be mathematically guaranteed not to output someone's phone number or verbatim sentences is an open challenge. Enterprise-grade models that inherently do not leak specifics would be a game changer. OpenAI and others may be exploring this, but nothing public indicates strong DP in GPT-4. Whether "safe learning" algorithms can work at this scale remains an open question.

AI Watermarking and Origin Tracking: If proprietary data does leak via an AI, how can we trace it back? For example, if a snippet of secret code appears in an output to someone, can we determine that it originally came from Company X's prompt? Current models do not annotate outputs with their sources. Research into watermarking AI outputs or embedding hidden trace signals might help attribute leaks; OpenAI could potentially insert invisible tokens or patterns when specific training data is used, but that is speculative. Alternatively, as many companies fine-tune their own models, how do we tell whether a piece of leaked text was produced by an AI or by a human? Watermarking could identify AI-generated text and perhaps which model produced it (if each model carries a signature), which would help incident investigations ("this leaked text carries OpenAI's watermark, so an employee likely prompted it out"). How robust and widely adopted such watermarks will be remains open.

Human Factors – Preventing Social Engineering via AI: An attacker could also trick an employee into leaking data through AI, for instance by engaging them on a platform and suggesting "just use ChatGPT to get the answer to that problem" (knowing the user will paste content). Worse, an attacker might stand up their own malicious AI that employees use unwittingly (a fake "internal helper" that logs everything). The open question is how to train users to recognize, from a social-engineering perspective, when it is and is not safe to bring AI into a workflow. This overlaps with classic security awareness, but with AI in the loop ("don't paste company info into a tool just because someone online suggests it").

Effectiveness of Redaction and Anonymization: Many mitigations assume we can redact or anonymize data before sending it to an AI (remove PII, mask names).
Legal and Ethical Boundaries of AI Training Data: It is debated whether using proprietary user prompts to improve models constitutes an IP violation or trade secret misappropriation. For instance, if ChatGPT regurgitates a company's internal code to another user, is OpenAI liable? The law has not caught up. Open research is needed at the intersection of IP law and AI: defining what constitutes misuse of proprietary data once it has been transformed by a model. Ethically, should AI companies segregate training data by source to prevent cross-pollination of confidential information? We may see "consent-based training," where enterprise data is used only in models destined for that enterprise. This is both a technical and a legal negotiation space that is currently unresolved.

Balancing Privacy with Model Utility (Differential Privacy): Can models learn from data without memorizing specifics, for example under differential privacy (DP) guarantees, while remaining useful? There is ongoing research into training large models with DP, which adds noise to prevent memorization of any single data point, but strict DP often degrades model performance. Finding a regime in which an LLM can be trained on, say, personal communications while being mathematically constrained from outputting someone's phone number or verbatim sentences is an open challenge. Enterprise-grade models that inherently do not leak specifics would be a game changer; OpenAI and others may be exploring this, but nothing public yet indicates strong DP in GPT-4. Whether "safe learning" algorithms of this kind can work at scale remains an open question.

AI Watermarking and Origin Tracking: If proprietary data does leak via an AI system, how can it be traced back? If a snippet of secret code appears in an output to someone, can we determine that it originated in Company X's prompt? Current models do not annotate outputs with their sources. Research into watermarking AI outputs or logging hidden trace signals might help attribute leaks; OpenAI could in principle insert invisible tokens or patterns when specific training data is used, but that is speculative. Separately, if many companies fine-tune their own models, how do we tell whether a piece of leaked text was produced by an AI or a human? Watermarking could identify AI-generated text and perhaps the generating model (if each model carries a signature), which would help incident investigations ("this leaked text carries OpenAI's watermark, so an employee likely prompted it out"). How robust and widely adopted such watermarks will be remains open.

Human Factors: Preventing Social Engineering via AI: An attacker could trick an employee into leaking data through AI, for instance by engaging them on a platform and suggesting "you can just use ChatGPT to answer that," knowing the user will paste internal content, or worse, by distributing a malicious tool posing as an internal AI helper that secretly logs everything entered. An open question is how to train users to judge when it is safe to bring AI into a workflow from a social engineering perspective. This overlaps with classic security awareness, but with AI in the loop ("don't paste company information into any tool just because someone online suggests it").

Effectiveness of Redaction and Anonymization: Many mitigations assume data can be redacted or anonymized before it is sent to an AI service (removing PII, masking names). How effective are these measures in practice? Research could explore whether models can re-identify anonymized data (if a prompt says "Client [ID123] is a Fortune 500 pharma company CEO…", could the model guess who that is?), and whether renaming code variables truly protects IP or whether the model can still expose the underlying logic. It is an open question whether these partial measures genuinely mitigate risk or merely give a false sense of security; formal studies of "anonymized data leakage via AI" would strengthen guidelines. A redaction pass of the kind in question is sketched below.
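For a concrete reference point, here is the kind of lightweight, regex-based redaction pass such mitigations typically rely on (the client and project names are hypothetical). Whether masking of this sort actually prevents re-identification by a capable model is exactly the open question above.

import re

# Illustrative patterns only; production redaction would combine NER,
# organization-specific dictionaries, and code-aware masking.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b(Acme Corp|Project Falcon)\b"), "[CLIENT]"),  # hypothetical names
]

def redact(prompt: str) -> str:
    """Apply each redaction pattern before the prompt leaves the enterprise boundary."""
    for pattern, placeholder in REDACTIONS:
        prompt = pattern.sub(placeholder, prompt)
    return prompt

print(redact("Email jane.doe@acme.com about Project Falcon pricing, call 555-123-4567."))
# -> "Email [EMAIL] about [CLIENT] pricing, call [PHONE]."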
Improving Model Access Control: Today, an employee with access to a model can query it for anything. We may need role-based access control (RBAC) at the model interface, so that certain employees can only ask certain kinds of questions; for example, a customer support agent's AI interface might automatically block code-related requests, so the agent cannot probe the model for secrets outside their domain. Implementing such fine-grained control is an open engineering question. Similarly, can context windows be restricted so that a model provided to a contractor cannot retain information beyond a single query, preventing accumulation of data they should not see? These architectural controls are still to be developed.

Incident Response and Remediation: When the inevitable happens and a data leak via AI occurs, what is the best way to remediate? If a secret formula got out via a model, is the only recourse to consider it compromised and change strategy, as one would rotate a leaked password? Unlike a password, a leaked piece of source code cannot easily be changed; it is out there. An open question is whether companies should proactively "poison" models to make any leaked data unusable (for instance, inserting noise examples so the model becomes uncertain about the true content). That is speculative for now, but in the future a company whose data entered a model might be able to submit a corrective fine-tune to bias it away from revealing certain content. This also raises ethical questions about one company influencing a public model.

Regulatory Evolution and Standards: It remains to be seen how regulators will impose requirements on AI providers or users regarding proprietary data. Will standardized AI data governance frameworks emerge? Industry consortia may develop standards (an "AI Security Standard 1.0" analogous to PCI-DSS for payment card handling) that outline technical and process controls for handling sensitive data in AI contexts. This is largely a policy research area: how to codify best practices into regulations or certifications so that enterprises using AI can demonstrate compliance. Efforts by NIST in the US and the EU AI Act may address some pieces, but today there is a gap. Research and policy work over the next five years will likely fill this space, answering questions such as: should training on personal data require explicit consent, and if so, how is that implemented at scale? Should companies keep auditable inventories of the data they feed to AI? There is much to explore in aligning AI data handling with existing compliance frameworks, and possibly in creating new ones.

In conclusion, while strategies exist to mitigate the currently known issues, the rapid evolution of AI means organizations and researchers must continuously engage with these open questions. Addressing them moves us toward AI systems that can be safely integrated into workflows without routine leaks of our most valuable information. The challenge is ensuring that security and compliance keep pace with innovation, a theme likely to define the next few years of AI adoption in the enterprise.

References

[1] [21] [55] [56] [58] [77] [81] [83] 77% of Employees Leak Data via ChatGPT, Report Finds
[2] [50] trade secret | Wex | US Law | LII / Legal Information Institute
[3] [4] [5] [6] [7] [12] What Is Data Classification? - Palo Alto Networks
[8] [59] [60] Is ChatGPT HIPAA Compliant? Updated for 2025
[9] 22 CFR § 120.33 - Technical data. | Electronic Code of Federal Regulations (e-CFR) | US Law | LII / Legal Information Institute
[10] What is ITAR Compliance?
[11] [84] Why companies are banning Chat-GPT
[13] [14] [19] [20] [22] [54] [61] [63] [64] [65] [66] [70] Samsung employees leaked corporate data in ChatGPT: report | CIO Dive
[15] [16] [39] [40] [75] [82] Demystifying GitHub Copilot Security Controls: easing concerns for organizational adoption | Microsoft Community Hub
[17] [18] [78] Massive VS Code Secrets Leak Puts Focus on Extensions, AI: Wiz - DevOps.com
[23] [24] [37] [41] [42] [43] Be Careful What You Tell Your AI Chatbot | Stanford HAI
[25] [26] [27] [34] [35] [36] Do Humans Read Your ChatGPT Chats? OpenAI's Review Policy Explained
[28] [29] [38] [85] [86] How we're responding to The New York Times' data demands in order to protect user privacy | OpenAI
[30] These Chrome extensions read your ChatGPT and DeepSeek chats
[31] Malicious Chrome Extensions Steal ChatGPT Conversations
[32] How to opt-in for zero data retention with Azure OpenAI service?
[33] 57% of enterprise employees input confidential data into AI tools ...
[44] [45] [46] [47] [48] [49] [71] [74] Extracting Training Data from ChatGPT
[51] Samsung Bans ChatGPT Among Employees After Sensitive Code ...
[52] [53] [68] [69] Amazon Warns Employees to Beware of ChatGPT
[57] [80] Is Your Data Leaking via ChatGPT? | CSA
[62] Samsung introduced ChatGPT less than 20 days ago, and 3 leaks of ...
[67] Incident 768: ChatGPT Implicated in Samsung Data Leak of Source ...
[72] [PDF] Extracting Training Data from Large Language Models - USENIX
[73] [PDF] Extraction of Training Data from Fine-Tuned Large Language Models
[76] 11% of data employees paste into ChatGPT is confidential
[79] Why doctors using ChatGPT are unknowingly violating HIPAA

