
In the beginning of 2023, I began doing some research into how generative Artificial Intelligence (AI) might be used within the field of software architecture.
With over 27 years’ experience as a software architect, I have gained an appreciation for the complexities of software architecture, and I felt there must be some way to leverage this latest trend to assist with the myriad complex decisions software architects face. This article describes some of my research into how AI can be used within the software architecture field.

Defining Software Architecture
Software architecture is the structure of a software system or product, much as architecture describes the structure of a building. For example, the architecture of an office building has a particular shape, external and internal walls, multiple floors, a roof, and so on. How you decorate each office or meeting room doesn’t impact the structure of the building (that would be considered design).
Software architecture describes the characteristics a system must support, orthogonal to the functionality of the system or product. For example, does the system need to scale to a certain number of concurrent users? Does it require a certain level of responsiveness or performance? Does it need to be available all the time? These characteristics are a large part of software architecture.
Ignoring architecture in a software system would be like ignoring architecture in an office building. Can you imagine a group of bricklayers, electricians, plumbers, and framers all getting together and starting to construct a 10-story office building from scratch without an architecture in place? Unfortunately, this is what happens with many software systems and products: developers just start coding the functionality, ignoring the overall structure. The functionality may work perfectly, but the system might not be able to scale beyond five users, the response time might be so bad the system becomes unusable, and the system might become nearly impossible to maintain, test, and deploy. Those factors are all about architecture, which is why it’s a critical part of a software system or product.
Generative AI and Software Architecture
A compelling question I had in early 2023, when generative AI started becoming popular, was “Can AI help me as a software architect?” I thought about all the activities I perform as a software architect, and a few surfaced as possibilities for a generative AI engine to help me with:
· Risk assessment: “Are there risk areas within the architecture?”
· Risk mitigation: “How should I address the risk?”
· Anti-Patterns: “Are there any common anti-patterns in the architecture?”
· Decisions: “Should I use orchestration or choreography for the workflow?”
I was excited to see how generative AI might be able to help me answer some of these questions. However, before an LLM (Large Language Model) can answer them, it must first understand my architectural solution.
Modeling Software Architecture
For an LLM to understand and learn about an architectural solution, it must be described in detail in the form of prompts. You could write something like
“Suppose the architecture consists of 10 independently deployed services, with each owning their own data. Here’s what each of the services do and how they interact with each other…”
but this sort of prompt is completely unstructured and would take forever to write. Besides, what if you consolidated some of the services or added new ones? The LLM would have to understand changes to the architecture as well, making it an even more complicated task. This is where standard architecture modeling tools help.
I turned to two popular software architecture modeling tools, ArchiMate and Structurizr, to see if ChatGPT could understand my architectural solution. I created a simplified architecture based on a highly responsive and elastic standardized student testing system I had worked on in the past and rendered it in both modeling tools.
I then wrote some prompts using the exported XML, and asked ChatGPT to describe my architecture:

ChatGPT:
This is an ArchiMate 3.0 model of the MTG architecture. ArchiMate is a modeling language for enterprise architecture that provides a common language for describing enterprise architecture artifacts. The model contains eight elements and six relationships between them. The elements are defined by their type, such as BusinessActor, SystemSoftware, TechnologyService, TechnologyEvent, TechnologyCollaboration, and Device. The elements have unique identifiers, names, and additional properties.
The elements in the model include:
• Student: a BusinessActor representing the user of the system.
• Test Database: a SystemSoftware representing the database where the test questions are stored. It has two properties: “Relational” and “300”.
• Signon Service: a TechnologyService representing the service responsible for authentication and user login.
• TestTaker Service: a TechnologyService representing the service responsible for serving test questions to the user.
• AutoGrader Service: a TechnologyService representing the service responsible for automatically grading user responses.
It didn’t surprise me that ChatGPT was able to parse the XML-based ArchiMate and Structurizr versions of my architectural solution and describe it in plain English. That’s interesting, but it’s basic stuff from an AI perspective and not overly useful. However, I should note that I didn’t supply descriptions of the artifacts in the ArchiMate or Structurizr XML; the LLM added those descriptions, and they were surprisingly accurate. The LLM now had a context for analyzing my architecture, and I could use that standard model to feed additional prompts, no matter how often I changed the architecture.
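To see how a structured model can feed prompts mechanically, here is a minimal sketch that flattens a simplified, hypothetical ArchiMate-style XML fragment into plain text for an LLM. Real ArchiMate exchange files use XML namespaces and far richer metadata, so the element and attribute names here are illustrative only.

```python
import xml.etree.ElementTree as ET

# A simplified, hypothetical fragment in the spirit of an ArchiMate
# export; real exchange files use XML namespaces and richer metadata.
ARCHIMATE_XML = """
<model>
  <elements>
    <element id="s1" type="BusinessActor" name="Student"/>
    <element id="s2" type="TechnologyService" name="TestTaker Service"/>
    <element id="s3" type="SystemSoftware" name="Test Database"/>
  </elements>
  <relationships>
    <relationship source="s1" target="s2" type="Serving"/>
    <relationship source="s2" target="s3" type="Access"/>
  </relationships>
</model>
"""

def model_to_prompt(xml_text: str) -> str:
    """Flatten a model export into plain text an LLM can consume."""
    root = ET.fromstring(xml_text)
    names = {e.get("id"): e.get("name") for e in root.iter("element")}
    lines = [f"- {e.get('name')} ({e.get('type')})" for e in root.iter("element")]
    lines += [
        f"- {names[r.get('source')]} --{r.get('type')}--> {names[r.get('target')]}"
        for r in root.iter("relationship")
    ]
    return "Describe this architecture:\n" + "\n".join(lines)

print(model_to_prompt(ARCHIMATE_XML))
```

The benefit of this approach is that the prompt regenerates automatically from the exported model whenever the architecture changes, rather than having to be rewritten by hand.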
Analyzing Software Architecture
I purposely added an obvious bottleneck in my architectural solution—a single, monolithic database that would likely not scale and perform to the specific requirements (200,000 concurrent users with an average response time of 600ms under full load). I added additional prompts to describe these constraints, as well as other details about the system (such as that the TestTaker Service would need to read the test questions from the database for each student, that these were standardized tests, and so on). Then I asked ChatGPT whether it saw any problems or bottlenecks with my proposed architectural solution:
ChatGPT:
The ability of an architecture to scale to 200000 concurrent users depends on several factors, including the architecture itself, the hardware and infrastructure on which it is deployed, the efficiency of the code, and the performance of the underlying database system.
Assuming that the architecture is designed to handle such a large number of concurrent users and the hardware and infrastructure are capable of handling the load, the bottleneck in this scenario is likely to be the database.
Based on the information provided, the relational database with 300 connections and 2TB in size is capable of handling a significant amount of data. However, the average query time of 20 milliseconds and an average throughput of 100000 requests per second suggest that the database may struggle to keep up with the demands of 200000 concurrent users.
Again, I wasn’t overly surprised that it identified the database as a potential bottleneck. This is basic knowledge, something I expected the LLM to identify; after all, the database is statistically the most likely source of scalability and performance bottlenecks.
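The back-of-envelope arithmetic behind that conclusion can be sketched as follows, assuming each of the 300 connections serves one query at a time. The per-user query rate under full load is my own assumption for illustration, not a figure from the prompts.

```python
# Figures taken from the prompts described above; the per-user query
# rate is an assumption made for illustration.
connections = 300            # database connection pool size
avg_query_time_ms = 20       # average query latency
concurrent_users = 200_000
queries_per_user_per_s = 1   # assumed request rate under full load

# If each connection serves one query at a time, peak database
# throughput is bounded by connections / query time:
max_queries_per_s = connections * 1000 / avg_query_time_ms

demand_queries_per_s = concurrent_users * queries_per_user_per_s

print(max_queries_per_s)      # 15000.0
print(demand_queries_per_s)   # 200000
```

Even under these generous assumptions, demand exceeds the connection-bound ceiling by more than an order of magnitude, which is why a single monolithic database is the obvious suspect.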
Then, to make things interesting, I asked it what it would do to mitigate this risk:
ChatGPT:
Both replicated and distributed caching can help improve performance and scalability of an application. Replicated caching involves storing copies of the cache data on multiple nodes, while distributed caching involves partitioning the cache data across multiple nodes.
Caching is an obvious solution for removing both load and dependencies from a database, but nevertheless I felt I might be getting somewhere. Now it was time for the real test: asking what sort of cache it would recommend I use.
ChatGPT:
In your case, since the application is read-heavy, you may benefit more from distributed caching, as it can provide better scalability and performance than replicated caching. With distributed caching, the cache data is partitioned across multiple nodes, allowing for more concurrent reads and reducing the load on the database. There are several distributed caching solutions available, such as Apache Ignite, Hazelcast, and Redis. You can evaluate these solutions and choose the one that best fits your needs and budget. Keep in mind that distributed caching adds complexity to the architecture, so you should weigh the benefits against the added complexity and maintenance overhead.
Much to my dismay, this was not the answer I was looking for. Given the static nature of the test questions for any given standardized test, the more appropriate answer would have been a single in-memory cache within each instance of the service, not a distributed cache. Furthermore, its definition of a replicated cache did not align with another possible option, in-memory replicated caching.
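A minimal sketch of the answer I was looking for, assuming the questions are static and using a hypothetical loader function, might look like this per-instance in-memory cache:

```python
import threading

class InMemoryQuestionCache:
    """Per-instance cache of static test questions.

    Because standardized test questions rarely change, each service
    instance can hold its own copy and bypass the database entirely
    after the first load (names here are illustrative).
    """

    def __init__(self, load_questions):
        self._load = load_questions    # e.g. a database query, injected
        self._cache = {}
        self._lock = threading.Lock()

    def questions_for(self, test_id):
        with self._lock:
            if test_id not in self._cache:
                # Single database hit per test, per service instance.
                self._cache[test_id] = self._load(test_id)
            return self._cache[test_id]

# Usage: the loader runs once per test id, not once per student.
calls = []
def load_from_db(test_id):
    calls.append(test_id)
    return [f"question 1 for {test_id}", f"question 2 for {test_id}"]

cache = InMemoryQuestionCache(load_from_db)
cache.questions_for("math-101")
cache.questions_for("math-101")
print(len(calls))  # 1 — the second call was served from memory
```

Because every instance can hold the full question set for the tests it serves, there is no need for the partitioning, network hops, or operational overhead of a distributed cache.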
A Dead End?
Not quite. I continued this research, adding many more details about the architecture, the constraints, the environment, and so on. Each time I asked the LLM to identify risk areas, it found some but missed others. When asked what anti-patterns it saw in my architectural solution, it merely defined some common software architecture anti-patterns rather than identifying any within my architecture. At a minimum, I figured it would flag the use of a single monolithic database in a highly scalable distributed architecture as an anti-pattern, but maybe that needs to be officially coined before an LLM can identify it.
This research was useful in confirming what we already know about software architecture—that everything in software architecture is a tradeoff. Maybe a monolithic database is appropriate because of high semantic coupling of the data or the need for high levels of data integrity and consistency. Maybe in-memory caching wouldn’t work because of the large amount of data contained within the memory of each service instance. There are no best practices within the structural aspect of software architecture, meaning that the decisions architects make require wisdom. LLMs have plenty of knowledge, but not wisdom.
My research, albeit limited, demonstrated to me that we are a long way from having an LLM create a software architecture or help identify potential risks and problems that aren’t already commonly known by most software architects. However, I thought of another way AI can help us with software architecture.
Autonomic Systems
Back in the late 1990s, IBM (along with a host of other companies) conducted research on what are known as autonomic systems. Autonomic systems are self-managing systems that can adapt to unpredictable changes while hiding complexity from operators and users. Every autonomic system must exhibit the following set of properties, all without human intervention:
Self-Aware: The ability of the system to monitor and assess its state.
Self-Configuring: The ability of the system to change its configuration.
Self-Healing: The automatic discovery and correction of faults.
Self-Protecting: The proactive protection from arbitrary attacks.
You might remember the 1968 film 2001: A Space Odyssey and one of its main characters, HAL (Heuristically Programmed Algorithmic Computer). The HAL 9000 was the intelligent supercomputer that ran Discovery One, the ship headed for Jupiter to investigate the famous monolith. HAL was, in fact, an autonomic system that exhibited all the properties listed previously. Simply put, HAL was an AI system. We can add this kind of intelligence to our business systems, leveraging AI within our software architectures rather than having it act as an architecture analysis companion.
Research into autonomic systems gave rise to a series of architectural patterns that were derived largely from the Reactive Manifesto, published in September 2014 by Jonas Bonér, Dave Farley, Roland Kuhn, and Martin Thompson. As quoted on the Reactive Manifesto website, “Systems built as Reactive Systems are more flexible, loosely-coupled and scalable. This makes them easier to develop and amenable to change. They are significantly more tolerant of failure and when failure does occur they meet it with elegance rather than disaster. Reactive Systems are highly responsive, giving users effective interactive feedback.”
Turns out, reactive systems have properties that closely match those of autonomic systems and can act as a starting point for developing truly autonomic systems. These properties are as follows:
Responsive: the system responds in a consistent, rapid, and timely manner whenever possible; how the system reacts to users.
Resilient: the system stays responsive after a failure through replication, containment, isolation, and delegation; how the system reacts to failures.
Elastic: the system stays responsive under varying workload; how the system reacts to load.
Message Driven: the system relies on asynchronous messaging to ensure loose coupling, isolation, and location transparency; how the system reacts to events.

So how does AI fit into all of this? By combining various reactive architecture patterns to create reactive systems, you can add a level of intelligence to business applications that mimics autonomic (AI-based) systems.
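As a rough illustration, one autonomic property, self-healing, can be sketched as a simple supervisor that detects a failed component and replaces it without human intervention. All names here are illustrative; production systems typically delegate this behavior to platforms such as Kubernetes or to actor frameworks.

```python
class SupervisedService:
    """Minimal self-healing sketch: a supervisor detects a failed
    component, replaces it, and retries the request once. Names are
    illustrative, not a real framework API."""

    def __init__(self, factory):
        self._factory = factory      # knows how to build a fresh instance
        self._instance = factory()
        self.restarts = 0

    def call(self, request):
        try:
            return self._instance(request)
        except Exception:
            # Self-healing: replace the failed instance and retry once.
            self._instance = self._factory()
            self.restarts += 1
            return self._instance(request)

# Simulate a defective first instance; replacements behave correctly.
created = {"count": 0}

def make_worker():
    created["count"] += 1
    defective = created["count"] == 1   # only the first instance fails
    def worker(request):
        if defective:
            raise RuntimeError("instance crashed")
        return f"handled {request}"
    return worker

svc = SupervisedService(make_worker)
print(svc.call("req-1"), svc.restarts)   # handled req-1 1
```

The caller never sees the failure; the supervisor absorbs it and recovers, which is the essence of the self-healing property described above.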
Summary
We are a long way from creating truly autonomic business systems, but the patterns of reactive architecture, combined with basic AI and ML, give us a great head start toward that realization. Rather than focusing on how generative AI can assist a software architect in creating, analyzing, and validating software architecture, we should invest our time in investigating how AI can be used to create truly self-aware, self-healing, self-configuring, and self-protecting software systems that can adapt to ever-changing technology, environments, business, and market conditions.
About the author

Mark Richards