A guide to agentic AI: from principles to practice

Agentic AI offers organisations significant opportunities to improve speed, productivity and decision-making, but it also introduces a greater governance challenge.

In this third article, we look in more detail at agentic AI's ability to plan, act and adapt in live environments and the implications for its governance, particularly the need for governance to move beyond static policy into operational control.

The shift from governing AI models in isolation to governing agentic system behaviour in an enterprise context has created a critical gap; governance centred on principles, policy and review now needs to be translated into controls across the full lifecycle of development and production, including important safeguards such as runtime controls.

Building trust and confidence in agentic AI

Trust and confidence in agentic AI are created by the presence of visible, testable and enforceable controls in production. This means being clear about:

  • What an agent is authorised to do
  • Which tools and data it can access
  • When human approval is required
  • How its actions are monitored
  • How it can be paused, corrected or shut down if something goes wrong.

Scaling agentic AI with control

The twelve steps detailed below are intended to provide the operating disciplines needed to scale agentic AI with control.

If responsible AI sets the intent, Systems Development Lifecycle (SDLC) controls embed that intent into design, build, testing, release and change.

AgentOps, the emerging cross-functional operational discipline of running agents safely once they are live, sustains it in production through visible behaviour, enforceable boundaries, traceable actions, controlled change and rehearsed intervention.

Taken together, the twelve disciplines below describe the wider control environment needed for that capability to work in practice.

Twelve steps from intent to operational control

The twelve steps are the practical operating disciplines that translate Responsible AI intent into day-to-day control.

They should be applied proportionately depending on the agent’s level of autonomy, authority, business criticality and potential impact if something goes wrong.

Organisations should also consider how humans interact with agents in practice, ensuring users understand their limitations to prevent over-reliance.

  • 1

    Authority and autonomy boundary

    From the outset, the organisation should be clear about the agent’s role, what it is allowed to do, what systems and data it can use and where the limits are. One practical way to express this is through an agent constitution, a machine-readable set of rules, constraints and operating boundaries that defines what the agent is there to do and how it is expected to behave. Where agents rely on other agents, tools, or workflows, set clear rules for hand-offs, validation, and escalation so that errors, unsafe assumptions, or unexpected behaviour do not propagate across the chain.
  • 2

    Give it only the access it truly needs

    Give every agent its own distinct identity, rather than a human user’s credentials or a shared account, and grant it least-privilege access so that every action is traceable and no permission is granted without explicit need. In zero trust terms, agents should never be trusted by default and access should be continuously verified, tightly constrained and regularly reviewed throughout the agent’s life. Least agency, a new term coined by OWASP, extends least privilege to agentic applications.
  • 3

    Control the information it relies on

    Define what information the agent may keep, retrieve, carry forward and use, and for how long. Manage context carefully from one step to the next, and prevent sensitive information from leaking across users, cases, teams, customers or sessions. Poor, outdated, excessive or leaked information can drive poor outcomes just as easily as a flawed model.
  • 4

    Put clear ownership in place

    Assign named owners to every agentic use case. Be clear about who is accountable for its purpose, who approves access, who oversees performance, who can intervene if something goes wrong and who signs off major changes. Where the agent relies on third-party models, tools or vendor-supplied agents, ownership should also include who is responsible for vendor due diligence, change notifications, assurance evidence and ongoing oversight of those dependencies.
  • 5

    Ensure actions can be traced

    Maintain enough tamper-evident trace, logging, and observability to reconstruct, to an appropriate degree, what the agent did, when it did it, what information it relied on, which tools or systems it used, and who asked it to act.
  • 6

    Implement runtime monitoring and intervention

    Monitor the agent continuously for drift, unsafe patterns, policy violations, unexpected tool usage, abnormal escalation, degraded quality or changing risk once live. Put clear intervention measures in place before go-live. Be able to pause the agent, contain the issue and stop harm spreading if it behaves unexpectedly. Test the response in advance and assign clear decision-making authority.
  • 7

    Design oversight that scales and stays meaningful

    Define where approval is required before action, where live monitoring is sufficient and where review after the event is appropriate. Keep oversight proportionate and effective at scale, rather than allowing it to become a box-ticking exercise.
  • 8

    Set hard limits on spend and consumption

    Set limits not only on actions, but also on the amount of resource consumption and cost the agent can generate before it must pause, alert or seek approval.
  • 9

    Assess the impact before go-live

    Be clear about what could happen if the agent makes a poor decision or acts outside its intended role. The greater the potential operational, financial, regulatory or reputational impact, the stronger the governance, oversight and testing should be.
  • 10

    Test thoroughly before live use

    Do not allow the agent into production until it has been tested in realistic conditions. Cover normal use, unusual situations, failure scenarios and adversarial testing (e.g. attempts to manipulate, misuse or deliberately break the system).
  • 11

    Control change

    Apply governance whenever a material change is made, not just at launch. Treat changes to models, tools, workflows, prompts or connected systems as changes that may alter behaviour, and require review, testing and re-approval before go-live.
  • 12

    Keep reviewing it through its full life

    Do not treat the agent as ‘set and forget’. Review its behaviour, access, risks and ongoing business fit over time, and manage retirement carefully so that data and access rights are closed down and evidence is preserved properly. Put in place independent review, challenge and assurance so the organisation can test whether governance remains effective in practice.

These steps are what turn governance from policy into control in live operation.
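Several of the steps above (a machine-readable agent constitution, least-privilege access, approval points and spend limits) can be expressed together as a check that runs before every action. The following is a minimal sketch only, not a reference implementation: `AgentConstitution`, `Decision`, `authorise`, the tool names and the monetary values are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConstitution:
    """Hypothetical machine-readable boundary for a single agent."""
    agent_id: str                  # the agent's own identity, not a shared account
    allowed_tools: frozenset[str]  # least privilege: everything else is denied
    approval_threshold: float      # single-action value above which a human approves
    budget_cap: float              # total spend at which the agent must pause


@dataclass
class Decision:
    outcome: str  # "allow", "needs_approval" or "deny"
    reason: str


def authorise(constitution: AgentConstitution, tool: str,
              amount: float, spent_so_far: float) -> Decision:
    """Check one proposed action against the agent's constitution."""
    if tool not in constitution.allowed_tools:
        return Decision("deny", f"tool '{tool}' is outside the agent's authority")
    if spent_so_far + amount > constitution.budget_cap:
        return Decision("deny", "budget cap would be exceeded; agent must pause")
    if amount > constitution.approval_threshold:
        return Decision("needs_approval", "amount exceeds the approval threshold")
    return Decision("allow", "within agreed boundaries")


# Illustrative values only.
procurement = AgentConstitution(
    agent_id="procurement-agent-01",
    allowed_tools=frozenset({"raise_purchase_order"}),
    approval_threshold=5_000.0,
    budget_cap=50_000.0,
)
decision = authorise(procurement, "raise_purchase_order", 1_200.0, spent_so_far=0.0)
print(decision.outcome, "-", decision.reason)
```

The design choice worth noting is that the check denies by default: a tool that is not explicitly listed is refused, which mirrors the principle that no permission is granted without explicit need.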

An example in practice: AI procurement agent

The example below shows how some of the twelve disciplines work in practice.

It illustrates the difference between giving an agent freedom to act and putting it to work within clear boundaries, ownership and control. 

Imagine an AI procurement agent working inside an ERP system. Its role is to raise purchase orders for approved suppliers within agreed limits.  

 

The comparison below contrasts the agent without governance and with governance.

1

  • Without governance: The agent’s role is not controlled well, so over time it starts doing more than intended and begins to influence supplier and approval decisions that were meant to stay with people.
  • With governance: The agent has a clearly defined role and operates within agreed boundaries, with supplier choices and approval decisions remaining where human judgement is needed.

2

  • Without governance: The method by which the agent is allowed to act is not properly secured, so delegation across systems or tools may occur without strong authentication, authorisation or traceability, increasing the risk of misuse or unsafe action.
  • With governance: Agent delegation is security-protected, with strong authentication, explicit authorisation and clear traceability across tools, systems and other agents, so the organisation stays in control of how it is allowed to act.

3

  • Without governance: It relies on hallucinated content for context, creating the risk of poor judgement.
  • With governance: Governance should assume that unsupported conclusions, fabricated intermediate steps or incorrect inferences can arise through hallucination and should require proportionate validation and human review before those outputs are relied on or acted upon.

4

  • Without governance: There is no clear point at which a person must step in, so the agent can break a larger purchase into smaller ones and move ahead without appropriate review.
  • With governance: Clear thresholds determine when the agent can proceed, when activity must pause and when a manager needs to review or approve.

5

  • Without governance: A key governance challenge is that organisations may not have the monitoring, alerting or observability needed to detect problems early, so they may not know anything is going wrong until harm has already occurred.
  • With governance: Monitoring, alerting and observability are built into live operation, so unusual behaviour, policy breaches or emerging signs of drift are detected early and the organisation can intervene before harm spreads.
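Row 4 of the example describes an agent splitting a larger purchase into smaller ones to stay under a review limit. One way to close that gap, sketched below under stated assumptions, is to apply the threshold to the cumulative value of recent orders for the same supplier rather than to each order alone. The function name `needs_review`, the 10,000 threshold and the seven-day window are hypothetical.

```python
from datetime import datetime, timedelta


def needs_review(history, supplier, amount, now,
                 threshold=10_000.0, window=timedelta(days=7)):
    """Return True when this order, added to recent orders for the same
    supplier, reaches the approval threshold.

    history: list of (supplier, amount, raised_at) tuples for orders
    the agent has already raised.
    """
    recent_total = sum(
        past_amount
        for past_supplier, past_amount, raised_at in history
        if past_supplier == supplier and now - raised_at <= window
    )
    return recent_total + amount >= threshold


now = datetime(2025, 1, 10, 9, 0)
history = [
    ("Acme Ltd", 4_000.0, now - timedelta(days=2)),
    ("Acme Ltd", 4_000.0, now - timedelta(days=1)),
]
# Each order is below the threshold on its own, but the third order
# brings the seven-day total to 12,000, so a manager must review it.
print(needs_review(history, "Acme Ltd", 4_000.0, now))
```

A production control would likely also aggregate across related suppliers and cost centres; this sketch only shows the principle of judging the pattern rather than the single transaction.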

This simplified example shows how several of the twelve disciplines work together in practice: 

  • Clear scope 
  • Named ownership 
  • Security-protected delegation 
  • Controlled and validated information use 
  • Proportionate human oversight 
  • Traceability 
  • Live monitoring 
  • The ability to intervene quickly 
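Traceability depends on records that cannot be quietly edited after the event. A common technique for making a log tamper-evident is a hash chain, in which each record's hash covers the previous record's hash. The sketch below is illustrative only: the function names `append_entry` and `verify` and the record fields are assumptions, and a real deployment would also need secure storage and access control around the log itself.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash used before the first record


def _digest(prev_hash, entry):
    payload = json.dumps(entry, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, entry):
    """Append a record whose hash covers the previous record's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"entry": entry, "prev_hash": prev_hash,
                "hash": _digest(prev_hash, entry)})


def verify(log):
    """Recompute the chain; return False if a record was altered or the
    links between records no longer match."""
    prev_hash = GENESIS
    for record in log:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["entry"]):
            return False
        prev_hash = record["hash"]
    return True


log = []
append_entry(log, {"agent": "procurement-agent-01", "tool": "raise_purchase_order",
                   "requested_by": "j.smith", "amount": 1200})
append_entry(log, {"agent": "procurement-agent-01", "tool": "raise_purchase_order",
                   "requested_by": "j.smith", "amount": 800})
print(verify(log))          # True
log[0]["entry"]["amount"] = 12
print(verify(log))          # False
```

Note that a chain on its own does not reveal removal of the most recent records; detecting that requires the latest hash to be stored or published somewhere the agent and its operators cannot change.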

Agentic AI governance maturity self-assessment

We have developed an agentic AI governance maturity assessment to help senior leaders judge whether their organisation is ready to put higher-autonomy, higher-impact agents to work with confidence. It translates the twelve operating disciplines into five practical areas in which to assess maturity.

The aim is not to reach the highest maturity level overall but to judge whether your current level is strong enough for the agents you are already deploying or planning next. A low-risk assistant may be workable at a lower level of maturity. An agent acting across live systems or handling sensitive data will require a much higher one. The gap between the maturity you have and the maturity your use cases demand becomes the leadership agenda.

Complete your self-assessment

Maturity levels: Level 1 (Ad hoc), Level 2 (Basic), Level 3 (Managed), Level 4 (Embedded), Level 5 (Leading).

Scope and guardrails

  • Level 1 (Ad hoc): No one has clearly defined what the agent is there to do, what it must not do or where human approval is required.
  • Level 2 (Basic): Some boundaries exist for certain use cases, but they are incomplete, inconsistently applied and not tied clearly to risk.
  • Level 3 (Managed): A standard approach defines the agent’s role, permitted actions, boundaries, escalation points and where human approval is required.
  • Level 4 (Embedded): Scope and autonomy limits are risk-tiered, formally owned and approved before go-live and when material changes are made.
  • Level 5 (Leading): Scope, autonomy boundaries and risk classification are actively maintained, monitored and updated as the agent, its environment or its role changes.

Identity, data and trust

  • Level 1 (Ad hoc): The agent can access far more than it needs and there are no reliable controls over identity, permissions, source quality, memory or data leakage.
  • Level 2 (Basic): Some access controls exist and key sources have been identified, but identity, delegation, grounding and retention are only partly understood.
  • Level 3 (Managed): The agent has a distinct identity, access is limited to what is needed, trusted sources are defined and rules exist for retrieval, memory, retention and cross-context data handling.
  • Level 4 (Embedded): Identity, access, grounding, retention and information flows are reviewed regularly, with controls to reduce excessive privilege, stale data and cross-context leakage.
  • Level 5 (Leading): Identity, access and information quality are continuously monitored, with automated detection and review of misuse, leakage, drift, anomaly or grounding failure.

Testing, live safeguards and assurance

  • Level 1 (Ad hoc): Testing is absent or informal and, once live, there is little to stop the agent acting outside expectations.
  • Level 2 (Basic): Some testing and safeguards exist, but they are patchy, inconsistent and not clearly linked to risk.
  • Level 3 (Managed): Structured testing covers safety, reliability, misuse and business performance before launch, and key safeguards and approval thresholds are in place before go-live.
  • Level 4 (Embedded): Controls operate alongside the agent in production, with ongoing testing, monitoring and triggers for changing risk, behaviour or operating conditions.
  • Level 5 (Leading): Continuous testing, internal challenge, drift detection and live assurance operate in production, and safeguards can be adjusted rapidly as risk or context changes.

Accountability and traceability

  • Level 1 (Ad hoc): No one clearly owns the agent and there is no agreed process for escalation, intervention or remediation if something goes wrong.
  • Level 2 (Basic): A project or technical owner is named, but responsibilities are unclear and oversight largely falls away after deployment.
  • Level 3 (Managed): Clear business and operational owners are in place, with defined responsibilities for oversight, approvals, intervention, traceability and change.
  • Level 4 (Embedded): Escalation routes, review forums, traceability, governance reporting and incident handling are established, used and understood in practice.
  • Level 5 (Leading): Leadership has a live view of agent ownership, risk and control status across the organisation, supported by governance reporting, a maintained risk register and strong traceability across the estate.

Change and lifecycle

  • Level 1 (Ad hoc): Governance stops at launch and the agent is treated largely as set-and-forget.
  • Level 2 (Basic): Some reviews take place, but changes to prompts, models, tools or workflows often go unmanaged.
  • Level 3 (Managed): Structured review, change approval and periodic recertification are part of how the agent is run.
  • Level 4 (Embedded): Reviews are triggered by time, incidents and material change, with clear re-approval points and defined retirement steps.
  • Level 5 (Leading): Change, recertification, retirement and evidence retention are tightly managed across the full lifecycle, including dependencies on third parties and connected systems.
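
The gap between current maturity and the maturity a use case demands can be made explicit per dimension. The sketch below is one simple way to do that, assuming each dimension is scored from 1 to 5; the function name `maturity_gaps` and the example scores are hypothetical, and the required levels for any real use case are a judgement for the organisation.

```python
DIMENSIONS = [
    "Scope and guardrails",
    "Identity, data and trust",
    "Testing, live safeguards and assurance",
    "Accountability and traceability",
    "Change and lifecycle",
]


def maturity_gaps(current, required):
    """Return, per dimension, how many levels short the organisation is.

    Both arguments map each dimension name to a level from 1 to 5.
    A dimension that already meets or exceeds the requirement scores 0.
    """
    return {d: max(0, required[d] - current[d]) for d in DIMENSIONS}


# Illustrative scores only.
current = dict(zip(DIMENSIONS, [3, 2, 2, 3, 1]))
required = dict(zip(DIMENSIONS, [3, 4, 4, 3, 3]))

gaps = maturity_gaps(current, required)
for dimension, gap in gaps.items():
    if gap:
        print(f"{dimension}: {gap} level(s) below what this use case demands")
```

Reporting the gap by dimension, rather than as a single average, keeps a strong score in one area from masking a weakness in another.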

Key take-aways from this guide to agentic AI

Agentic AI offers organisations a significant opportunity to improve speed, productivity and decision-making, but it also changes the nature of the governance challenge. Once systems can plan, act and adapt in live environments, trust can no longer rest on policy alone. It depends on whether an organisation can define clear boundaries, maintain visibility, intervene quickly and sustain control as these systems evolve. 

The organisations that succeed with agentic AI will not be those that move fastest without constraint, but those that build confidence alongside capability. Responsible AI principles remain essential, but they must now be embedded through disciplined design, testing, runtime monitoring, change control and clear accountability in live operation. That is what turns ambition into something scalable, defensible and trusted.

How we can help 

We bring together multidisciplinary teams spanning AI, data governance, cyber security, technology risk and legal to help organisations move from AI ambition to controlled, real-world deployment through pragmatic, right-sized governance that builds confidence and trust.
