LLM CMS Project
Karlo Karlović
This research examines the end-to-end development of an AI-enhanced content management system (CMS) review tool within a K-12 educational organisation, combining digital workflow design, product design methodologies, and Agile software development practices.
The project applied Agile methodologies to develop an internal AI review tool for the CMS, using digital workflow design principles to optimise content review processes and product design practices to ensure user-centred functionality. Staff were trained to provide structured input to development through established Agile processes, enabling rapid iteration and deployment of the AI-powered review system.
This research contributes to understanding how K-12 organisations can effectively leverage AI technologies through structured development methodologies, fostering a culture of innovation and continuous improvement. The findings suggest that combining digital workflow design, product design thinking, and Agile practices creates a robust framework for implementing AI solutions that address specific educational challenges while maintaining pedagogical integrity.
System Design and Implementation
The design phase employed an iterative approach prior to the commencement of programming and development, allowing for validation of the design and the collection of user feedback (Fessenden, 2024). Before any prototypes were created, a comprehensive review of comparable products in the market was conducted.
Comparable products in the CMS AI review space were rare due to the unique use case of this project: products on the market are designed to produce AI-generated content, not review it.
Consequently, adjacent product categories were explored, including AI workflows and CRM contact-enrichment features. Key observations from this review are summarised below.
| Feature observed | Product |
| --- | --- |
| AI provides recommendations based on page (article) content. | Atlassian.com |
| Visual indicator showing when content is AI-generated. | Attio.com |
| Explicit data sources can be set to enrich a new column about a CRM contact. | Clay.com |
| Data can be fed in from live sources and updated in the UI in real time. | Hopted.com |
| AI makes recommendations based on previous experience and surfaces this information to users. | Linear.app |
| Each column can have a unique, specific prompt to achieve an outcome. | Notion.com |
| Using input from one column and unique LLM prompts in other columns, the LLM can generate discrete results for various prompts; prompts can be chained together. | V7labs.com |
Implementation
Guided by the initial design process, a high-fidelity prototype was built targeting:
- Intelligent multi-stage content analysis for automated decision-making, and
- AI-powered reasoning integration for enhanced accuracy.
The first capability implements a novel five-stage analytical framework evaluating content freshness, editorial health, historical risk patterns, and quality metrics to generate actionable recommendations, a significant advance over traditional binary content management systems. The second enables configurable AI model selection between rapid prototyping (llama3.2) and advanced reasoning (deepseek-r1:70b) based on accuracy requirements.
This prototype represents a novel application combining automated content analysis with human-readable reporting, introducing comprehensive risk-based evaluation methodology that considers historical patterns and predictive failure analysis.
Technology Stack
TypeScript – TypeScript extends JavaScript with static type definitions, providing compile-time error detection and enhanced IDE support. The type safety mechanism reduces runtime errors while maintaining JavaScript compatibility, making it ideal for large-scale application development.
React – React implements a component-based architecture with a virtual DOM for efficient UI rendering and state management. The framework's declarative approach and hook system enable reusable, maintainable user interface components with predictable data flow patterns.
Supabase – Supabase provides a PostgreSQL database with auto-generated REST API endpoints via PostgREST, eliminating manual API development. The Dockerised environment ensures consistent deployment while offering persistent storage, B-Tree indexing for query performance, ACID compliance for data integrity, and horizontal scalability.
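As a brief illustration of the auto-generated endpoints, an assumed "articles" table can be queried directly with the supabase-js client, with no hand-written API code; the table and column names here are assumptions for this sketch, not the project's actual schema:

```typescript
import { createClient } from "@supabase/supabase-js";

// Illustrative only: querying an assumed "articles" table through the
// auto-generated PostgREST endpoint of a local Supabase instance.
const supabase = createClient(
  "http://localhost:54321",            // default local Supabase URL
  process.env.SUPABASE_ANON_KEY ?? "",
);

async function latestArticles() {
  const { data, error } = await supabase
    .from("articles")
    .select("id, title, updated_at")
    .order("updated_at", { ascending: false })
    .limit(20);
  if (error) throw error;
  return data;
}
```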
Next.js – Next.js is a full-stack React framework that provides server-side rendering, static site generation, and automatic code splitting for optimised performance. The framework includes built-in routing, API routes, and deployment optimisations, enabling scalable web applications with enhanced user experience and reduced loading times.
Drizzle-ORM - Drizzle-ORM provides a lightweight, type-safe Object Relational Mapping solution that maintains close-to-SQL syntax for database interactions. It enables schema definition, SQL query construction, and type-safe database operations while preserving the power and flexibility of raw SQL.
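For illustration, a Drizzle schema for articles and their LLM review results might look like the following; the table and column names are assumptions for this sketch rather than the project's actual schema:

```typescript
import { pgTable, serial, text, integer, timestamp } from "drizzle-orm/pg-core";

// Hypothetical schema sketch: an articles table and a table storing
// LLM review results against each article.
export const articles = pgTable("articles", {
  id: serial("id").primaryKey(),
  title: text("title").notNull(),
  content: text("content").notNull(),      // raw HTML from the Intranet
  updatedAt: timestamp("updated_at"),
});

export const llmReviews = pgTable("llm_reviews", {
  id: serial("id").primaryKey(),
  articleId: integer("article_id").references(() => articles.id),
  recommendation: text("recommendation"),  // KEEP | UPDATE | REMOVE
  qualityScore: integer("quality_score"),  // 1-10 scale
  createdAt: timestamp("created_at").defaultNow(),
});
```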
Ollama - Ollama enables local deployment of large language models, providing llama3.2:latest for rapid testing and deepseek-r1:70b for advanced reasoning tasks. The system allows user-configurable model selection based on specific requirements for performance versus accuracy trade-offs.
Tanstack Table – Tanstack Table delivers a headless, framework-agnostic data table solution with advanced features including sorting, filtering, pagination, and virtualisation. The component provides extensive customisation options while maintaining performance optimisation for large datasets.
Shadcn/ui – Shadcn/ui offers a collection of copy-paste React components built with Radix UI primitives and styled with Tailwind CSS. The library provides accessible, customisable UI components that maintain design consistency while allowing full control over styling implementation.
Tiptap – Tiptap provides a headless, extensible WYSIWYG rich-text editor built on ProseMirror, enabling editing and custom content rendering. The editor supports markdown shortcuts and custom extensions, and provides a modern interface for visualising and editing article content.
The existing Intranet content was loaded into a local-only Supabase database instance. Once running, the Next.js application renders these Intranet articles for users, using Tanstack Table to review LLM article reviews, Next.js dynamic routing to view individual articles, and Tiptap to render HTML article content.

From the dashboard, users can initiate a scan of all files or of specific folders. The request sends a batch job to the server via a Next.js API route, which cycles through the requested list of article reviews. One by one, it sends each Intranet article's content and metadata to the Ollama LLM service in five distinct requests, and the results are then surfaced to the user.

Users can provide feedback to the LLM via a 'User Review'. This stores a review against the LLM review, which future LLM reviews can refer to in order to correct their behaviour and analysis. A sketch of the review loop is shown below.
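The following is a minimal sketch of that per-article loop, assuming Ollama's standard /api/generate endpoint; the prompt text, helper names, and persistence details are illustrative assumptions, not the project's exact implementation:

```typescript
// Hypothetical sketch of the batch review loop: each article is sent to the
// local Ollama service in five distinct stage requests, and the results are
// persisted for the dashboard to display.
const OLLAMA_URL = "http://localhost:11434/api/generate"; // Ollama's default port

const STAGE_PROMPTS = [
  "Assess the freshness of this article...",            // Stage 1
  "Analyse the article's edit history patterns...",     // Stage 2
  "Assess historical risk based on past issues...",     // Stage 3
  "Evaluate the quality of the content...",             // Stage 4
  "Give a final KEEP/UPDATE/REMOVE recommendation...",  // Stage 5
];

async function reviewArticle(article: { id: number; content: string }, model: string) {
  const stageResults: string[] = [];
  for (const prompt of STAGE_PROMPTS) {
    const res = await fetch(OLLAMA_URL, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model,                                  // "llama3.2:latest" or "deepseek-r1:70b"
        prompt: `${prompt}\n\n${article.content}`,
        stream: false,                          // return one complete response
      }),
    });
    const { response } = await res.json();
    stageResults.push(response);
  }
  return stageResults; // persisted to Supabase in the real system
}
```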
Stage 1: Content Freshness Assessment
This stage evaluates how current and relevant the content remains over time.
Key Metrics:
- Freshness Score - Numerical rating of content currency
- Evergreen Status - Whether content remains perpetually relevant
- Time-Sensitive Indicators - Elements that become outdated over time
- Outdated References - Specific items that are no longer current
- Analysis Reasoning/Explanation
- Confidence Level
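For illustration, the result of this stage could be typed as follows; the field names are assumptions based on the metrics above, not the project's actual schema:

```typescript
// Hypothetical shape of a Stage 1 (freshness) result; field names are
// illustrative assumptions derived from the metrics listed above.
interface FreshnessAssessment {
  freshnessScore: number;            // numerical rating of content currency
  isEvergreen: boolean;              // whether content is perpetually relevant
  timeSensitiveIndicators: string[]; // elements that become outdated over time
  outdatedReferences: string[];      // specific items that are no longer current
  reasoning: string;                 // analysis reasoning/explanation
  confidence: number;                // certainty of the assessment
}
```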
Stage 2: Edit History Analysis
This stage examines the content's maintenance patterns and editorial health.
Key Metrics:
- Last Edit Age - Time since most recent modification (in months)
- Edit Frequency - Average number of edits per month
- Recent Activity Status - Whether content has been updated recently
- Edit Pattern Type - Classification of changes (minor tweaks, major revisions)
- Health Score - Overall editorial wellness rating (1-10 scale)
- Analysis Reasoning/Explanation
Stage 3: Historical Risk Assessment
This stage identifies potential problems based on past performance and failure patterns.
Key Metrics:
- Historical Issues - Previous problems encountered with the content
- Failure Patterns - Recurring types of issues or errors
- Potential Issues - Anticipated future problems based on trends
- Risk Score - Overall risk level rating
- Analysis Reasoning/Explanation
- Confidence Level
Stage 4: Content Quality Evaluation
This stage measures the overall quality, accuracy, and usefulness of the content.
Key Metrics:
- Quality Score - Overall content quality rating (1-10 scale)
- Accuracy Issues - Identified factual errors or inaccuracies
- Completeness Issues - Missing information or gaps in coverage
- Usefulness Score - Practical value to users (1-10 scale)
- Best Practice Alignment - Adherence to industry standards (1-10 scale)
- Analysis Reasoning/Explanation
- Confidence Level - Certainty in the evaluation
Stage 5: Final Recommendation
This stage synthesises all previous analyses into actionable recommendations.
Final Decision Options:
- KEEP - Content should remain as-is
- UPDATE - Content needs revision or refreshing
- REMOVE - Content should be deleted or archived
Supporting Information:
- Confidence Level - Certainty in the final recommendation
- Decision Reasoning - Comprehensive explanation of the recommendation
- Complete Stage Results - Full details from all four analysis stages
- Weighted Factor Scores - Relative importance of each analysis dimension:
  - Freshness impact
  - Editorial health impact
  - Historical risk impact
  - Quality impact
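To make the synthesis step concrete, the sketch below combines the four weighted factor scores into a single value; the weights, thresholds, and names are illustrative assumptions, and the real system derives its decision from the LLM's Stage 5 response rather than a fixed cut-off:

```typescript
// Hypothetical sketch of Stage 5 synthesis: combine the four stage scores
// with configurable weights into a single value driving the final decision.
type FinalDecision = "KEEP" | "UPDATE" | "REMOVE";

interface WeightedFactors {
  freshness: number;       // relative importance (or score) of each dimension,
  editorialHealth: number; // e.g. weights { freshness: 0.3, editorialHealth: 0.2,
  historicalRisk: number;  //                historicalRisk: 0.2, quality: 0.3 }
  quality: number;
}

function synthesise(
  scores: WeightedFactors,  // per-stage scores, normalised to 0-1
  weights: WeightedFactors, // per-stage weights, summing to 1
): FinalDecision {
  const combined =
    scores.freshness * weights.freshness +
    scores.editorialHealth * weights.editorialHealth +
    (1 - scores.historicalRisk) * weights.historicalRisk + // high risk lowers the score
    scores.quality * weights.quality;

  // Illustrative thresholds only.
  if (combined >= 0.7) return "KEEP";
  if (combined >= 0.4) return "UPDATE";
  return "REMOVE";
}
```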
User Interface
Figure 1 - Content Management Review System Dashboard
The above figure demonstrates the Content Management Review System dashboard. From here, the user is able to:
- Review previous LLM article reviews
- Click to navigate to the article page
- Start and stop a bulk analysis
- Check the status of the local LLM service and view the currently selected LLM model
- Configure the bulk analysis settings (a sketch of a possible settings shape follows this list) to control:
  - Delay between each article analysis
  - How many articles can be reviewed concurrently (batch size)
  - Which folders and files are excluded, based on their unique ID or name
  - Which model will be used for the analysis
- Quickly view user feedback outcomes
- View current system CPU usage, memory usage, and recent activity
- Navigate the file structure to find a specific folder and/or article in the Intranet
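A possible shape for these settings is sketched below; the names are assumptions for illustration only:

```typescript
// Hypothetical shape of the bulk analysis settings shown on the dashboard.
interface BulkAnalysisSettings {
  delayMs: number;       // delay between each article analysis
  batchSize: number;     // how many articles are reviewed concurrently
  excludedIds: string[]; // folders/files excluded by unique ID or name
  model: "llama3.2:latest" | "deepseek-r1:70b"; // model used for the run
}
```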
Figure 2 - Live LLM review in progress
The user can review the current progress of an article review being conducted by the LLM. The system ID, article ID, article name, data quality, confidence, analysis date, final recommendation, freshness score, edit health, historical risk, and quality score can all be quickly seen in a tabular format.
Table cells will update once the LLM completes the request and the result is stored in the database.
Figure 3 – Dashboard analytics
A user dashboard to quickly visualise results of LLM article reviews. Result accuracy may vary in the test environment.
Figure 4 - User feedback mechanism analytics
A user dashboard to view and summarise results from the user feedback mechanism, which feeds human-corrected revisions of previous LLM responses back to the LLM. For example, if an LLM response was deemed 'incorrect' by the user, the user can correct the LLM review; this correction is stored separately in the database to be reused by the LLM in future reviews.
Figure 5 - User feedback on a LLM article review result
On an individual article, the user can view previous LLM article reviews and investigate them further to access the exact reasoning behind each of the LLM's decisions. The user can then correct and/or provide feedback on that LLM review. This feedback can be used by the LLM in subsequent reviews to reduce the chance of it making the same mistake, as sketched below.
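One plausible way to feed stored corrections back into subsequent reviews is to prepend them to the stage prompt; the following sketch illustrates the idea with assumed type and function names:

```typescript
// Hypothetical sketch: prior human corrections for an article are retrieved
// and prepended to the stage prompt so the LLM can avoid repeating mistakes.
interface UserCorrection {
  articleId: number;
  originalVerdict: string; // what the LLM previously concluded
  correction: string;      // the user's corrected assessment
}

function buildPromptWithFeedback(
  basePrompt: string,
  corrections: UserCorrection[],
): string {
  if (corrections.length === 0) return basePrompt;
  const feedback = corrections
    .map((c) => `Previously you said: "${c.originalVerdict}". A reviewer corrected this to: "${c.correction}".`)
    .join("\n");
  return `Take the following human corrections into account:\n${feedback}\n\n${basePrompt}`;
}
```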
Figure 6 - Individual article with metadata
The user can navigate the folder and file tree to find a specific Intranet article. When clicked, the user can view the article metadata.
Figure 7 - Individual article content
After completing the above steps, the user can scroll down to view the article content rendered as HTML.