Testing GitHub Copilot in action

By Miika Oinonen

November 6th, 2025

A quick summary of this post:

  • I set out to test agentic coding and GitHub Copilot by trying to create a portfolio site for myself without writing a single line of code.

  • I briefly explain how agentic AI workflows work and how they affect my workflow as a developer.

  • I share my experience with GitHub Copilot based on the portfolio I built.

  • In addition, I share some of my thoughts about the general experience and value proposition of the agentic AI development workflow.




The landscape of software development is undergoing a rapid and profound transformation. For years, the primary interaction with large language models in a coding context was through simple, single-response mechanisms, such as autocomplete or the generation of small code snippets.

This generative AI paradigm was characterized by a one-and-done workflow: a developer would provide a prompt, the model would produce an output, and the task was considered complete until the next prompt was issued.

Hello, agentic AI assistants!

Now, a new paradigm, known as agentic coding, has emerged, shifting this workflow from a series of individual prompts to a more autonomous, goal-oriented process. In this new model, a developer can delegate a high-level task to an AI agent, which then works independently for an extended period, requiring minimal feedback or instruction.

I set out to test this by trying to create a portfolio site for myself without writing a single line of code. First, however, it is worth looking at exactly what agentic coding is and how it works.

What is agentic coding?

At its core, agentic coding is defined by a system built around an LLM that functions as an active collaborator or teammate. Unlike its generative predecessors, an agentic system does not simply respond to a query; it acts on a goal, manages tasks, and learns from its results. This functionality is underpinned by several core components that allow it to operate with a limited degree of independence - but independence nonetheless.

How does agentic coding work?

The way agentic AI assistants work can be briefly summarized in three steps: defining the scope and goals, execution, and reflection/iteration. Let me dive a bit deeper into each step, so we can better understand what it consists of.

1) Defining the scope and goals

First, every AI agent begins with a clearly defined, high-level goal provided through natural-language prompts. This goal can be as simple as implementing or changing a single function or as complex as building an entire application from scratch - as was the case in my experiment. The agent's ability to operate autonomously is predicated on its capacity for multi-step planning.

For example, upon receiving the prompt to build a portfolio site, the agent created a to-do list in the pull request, which it constantly updated with planned and completed tasks. This decomposition of a complex objective into smaller, manageable sub-tasks is a defining characteristic of an agentic workflow.
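To illustrate, a checklist for a task like mine might look something like this (a hypothetical reconstruction, not the actual list from my pull request):

```markdown
- [x] Scaffold a new Astro project
- [x] Create the landing page
- [x] Add a projects listing page
- [x] Add an about page
- [ ] Add a blog page backed by Sanity
- [ ] Write a README with setup instructions
```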

2) Execution

To execute these plans, the agent must be able to use tools. These tools are not limited to code generation but extend to the environment itself, allowing the agent to access files, run terminal commands, and interact with version control systems. Copilot accesses these tools through MCP servers. By default, two seem to be included in the agent session: github-mcp-server, which contains platform tools for interacting with GitHub and the repository, and the Playwright MCP server, which allows the agent to drive a browser to test the UI it is creating.
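For reference, MCP servers are typically registered through a JSON configuration. A minimal sketch of adding a server such as Playwright's might look like the following; the exact schema varies by host, so treat the keys as illustrative rather than authoritative:

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"],
      "tools": ["*"]
    }
  }
}
```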

3) Reflection and iteration

The final crucial component is reflection and iteration. An agent can evaluate its own actions, make adjustments to its strategy based on the outcome, and refine its plan over time to achieve the desired result. This continuous feedback loop is what allows agentic systems to handle complex, dynamic problems that would break traditional, rule-based automation.
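To make the loop concrete, here is a minimal TypeScript sketch of the plan-execute-reflect cycle described above. The Agent interface and its methods are my own illustration of the pattern, not Copilot's actual internals:

```typescript
// A minimal sketch of the agentic loop; names are illustrative.
type Task = { description: string; done: boolean };

interface Agent {
  plan(goal: string): Promise<Task[]>;                    // decompose the goal into sub-tasks
  execute(task: Task): Promise<string>;                   // act: edit files, run commands, call MCP tools
  reflect(task: Task, result: string): Promise<boolean>;  // evaluate: did the result meet the goal?
}

async function runAgent(agent: Agent, goal: string): Promise<void> {
  const tasks = await agent.plan(goal);                   // 1) defining the scope and goals
  for (const task of tasks) {
    let accepted = false;
    while (!accepted) {
      const result = await agent.execute(task);           // 2) execution
      accepted = await agent.reflect(task, result);       // 3) reflection and iteration
    }
    task.done = true;
  }
}
```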

What value does agentic coding add?

  • Less time spent coding

  • Cognitive offload for developers

  • Shifting developer focus from mechanics to strategy

  • Developers can act as higher-level "operators"

  • Frees developer time and focus for more important tasks

The primary value proposition of this new workflow is not just a reduction in coding time but a significant cognitive offload for the developer. By delegating a complex, multi-step goal, the developer is freed from the mental burden of task decomposition and planning.

The agent's creation of a to-do list demonstrates that it is taking on the planning function, shifting the developer's focus from the mechanics of "how do I do this" to the strategic vision of "what do I want to build".

This allows developers to operate at a higher level of abstraction from the outset, focusing their efforts on intricate work and problem-solving rather than repetitive, boilerplate tasks, be it setting up layouts or writing basic data-fetching code. At the same time, this offload does not come free. Instead, the focus shifts from writing code to reading code and understanding what has been done.

My experience with agentic coding and GitHub Copilot

To test out this agentic coding workflow, I set out to create a portfolio site for myself using only GitHub Copilot. When you create a new repository, GitHub lets you give Copilot instructions on what you want to build. Copilot then takes that prompt, creates a pull request, and starts working as described above.

1) The first step

My initial prompt said that I wanted to create a portfolio to showcase my work, built with Astro, and that it should include a landing page, a page listing my projects, an about page, and a page for blog posts. It would also include an integration with the Sanity CMS.

Copilot created a pull request and got to work, creating a session where I could watch it work. Similar to agent mode in VS Code, the session view shows the agent's so-called “thought process” as it narrates what it is doing at any given point. At the same time, a diff of the changes is provided, and in theory you can follow along as the code is modified.

The first session took almost 30 minutes, which is understandable as the agent set up a whole project from scratch. The result, however, was quite acceptable, with everything working and looking, if not unique, then at least passable and professional enough. Copilot provided a very thorough README in the pull request, and even some screenshots, which were actually accurate and not hallucinated.
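To give a sense of the code such a prompt produces, a typical Sanity integration in Astro boils down to a small client plus a GROQ query. The following is a hedged sketch of that common pattern; the project ID and schema fields are placeholders, not what Copilot actually generated:

```typescript
// src/lib/sanity.ts -- a minimal Sanity client for fetching blog posts.
import { createClient } from "@sanity/client";

export const sanityClient = createClient({
  projectId: "your-project-id", // placeholder: found in your sanity.io project settings
  dataset: "production",
  apiVersion: "2024-01-01",     // pin a date-based API version
  useCdn: true,                 // cached reads are fine for a public site
});

export interface Post {
  title: string;
  slug: { current: string };
  publishedAt: string;
}

// GROQ query: fetch all posts, newest first.
export async function getPosts(): Promise<Post[]> {
  return sanityClient.fetch(
    `*[_type == "post"] | order(publishedAt desc){ title, slug, publishedAt }`
  );
}
```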

2) A follow-up task

For a follow-up, I tasked Copilot with helping me set up self-hosting Sanity Studio within the project, as well as tweaking the UI to be a bit more pleasing to my eye: mainly changing the color scheme. Again, Copilot succeeded in these tasks, taking about 20 minutes for each.

The studio hosting went without a hitch, and the color scheme refactor made the site a bit more pleasant to look at: the previous one, while passable, had some interesting color choices for gradients. Overall, then, the process and the result seem, at first glance, quite positive: a working site in a few hours.
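For reference, self-hosting the Studio alongside an Astro site is commonly done with the official @sanity/astro integration, which can mount the Studio on a route. A rough sketch with placeholder values; verify the option names against the integration's current docs:

```typescript
// astro.config.mjs -- embedding Sanity Studio at /admin (sketch; option names from docs as I recall them)
import { defineConfig } from "astro/config";
import sanity from "@sanity/astro";
import react from "@astrojs/react"; // the embedded Studio is a React app

export default defineConfig({
  integrations: [
    sanity({
      projectId: "your-project-id", // placeholder
      dataset: "production",
      studioBasePath: "/admin",     // serves the Studio from the same deployment
    }),
    react(),
  ],
});
```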

3) Some caveats

“At first glance” is the important phrase to concentrate on. While the speed of agentic workflows is impressive, the quality and maintainability of the generated code present a more nuanced picture.

My experience, which resulted in a "workable" site in a few hours, came with a critical qualifier: the code had several issues that would have caused compile errors in a stricter environment, and it required some follow-ups to address subjective design choices that screamed "AI built this".
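As an illustration of what I mean by a stricter environment, code in the following vein compiles under TypeScript's default, loose settings but fails once "strict": true is set in tsconfig.json. The snippet is my own example of the general flavor of issue, not Copilot's actual code:

```typescript
// Compiles with default (loose) settings; fails with "strict": true in tsconfig.json.
function formatDate(value) {          // error TS7006: parameter 'value' implicitly has an 'any' type
  return new Date(value).toLocaleDateString();
}

let post: { publishedAt?: string } = {};
formatDate(post.publishedAt.trim());  // error TS18048: 'post.publishedAt' is possibly 'undefined'
```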


This experience echoes a broader discourse on the "productivity paradox" of AI-assisted development: is the speed gained on the front end lost to hours of manual review and cleanup on the back end?


4) Can it be improved?

The answer to most of my issues is, of course, careful configuration of rules for the LLM to follow, as well as crafting the initial prompt to include as much detail as possible. This could enable the model to be more distinctive in its styling and adhere to stricter protocols.
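GitHub Copilot supports repository-level custom instructions for exactly this, read from a file like .github/copilot-instructions.md. A hedged sketch of what such a rules file could look like for this project; the file paths and conventions are illustrative:

```markdown
<!-- .github/copilot-instructions.md -->
# Project conventions for Copilot

- This is an Astro + Sanity portfolio site; use TypeScript with "strict": true.
- Follow the existing color tokens in src/styles/theme.css; do not invent new gradients.
- Prefer small, typed components; avoid inline styles.
- Run `npm run build` before finishing a task and fix any compile errors.
```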

Of course, at the same time, this only reinforces my earlier thesis about the potential productivity gains: timewise, the work moves from actually producing code to planning out and writing the guard rails for your model, as well as instructing it, followed by - hopefully - careful analysis and review of the resulting code.

Does agentic coding, then, fulfill the promise it makes: the promise of reduced cognitive load and freed-up time to do something else? The answer, of course, is not so simple. In my experience, it does free up time to concentrate on something else for a while, as the agent can be set on a task and left on its own for a good half an hour. As such, the agent can be set on a more menial or mundane task, or perhaps even something more substantial, such as setting up a repo with a template.

Conclusion and some thoughts on the state of agentic coding...

Whether the agent eases cognitive load is a much more complex question. As large language models are currently quite adept at solving programming tasks, especially in a popular language like JavaScript, they do free the programmer from having to think of programmatic solutions to their problems.

However, I would argue that this load only gets shifted elsewhere, as can be seen from my experience with the Copilot agent. If an optimal result is what is desired, significant effort still has to be made to craft guard rails for the model in the form of rules it has to follow, as well as a detailed prompt.

Then, after the agent has finished its work, significant effort has to - or at least should - be made to ensure that the resulting code is clean, well-crafted, and free of subtle bugs and security flaws.

In the end, then, agents can be seen as bringing a shift to workflows and to the skills needed to oversee them: a programmer no longer needs to be skilled primarily at producing code, but at expertly reading it and detecting flaws and pitfalls.
