ProRail teams up with Valcon to develop LLM-powered chatbot

ProRail, the government organisation responsible for the railway infrastructure in the Netherlands, has teamed up with Valcon to develop and test a new LLM-powered chatbot.
To explore the potential of large language models (LLMs), ProRail launched five rapid prototyping ‘pressure cooker’ projects, each testing LLMs against a real-world use case. One of these pressure cooker tests was carried out in collaboration with Valcon.
ProRail manages an extensive set of design rules for railway stations across the Netherlands. These regulations, which cover everything from platform widths to elevator access and bicycle parking, are stored within the Rail Infra Catalogue (RIC), a vast database accessed by contractors, architects and inspectors.
But the RIC’s basic search functionality made it difficult for users to find the exact information they needed. To address this, ProRail engaged Valcon to explore how an LLM could improve information retrieval from this complex dataset.
RICO, an LLM-powered chatbot
In just one week, Valcon developed a working prototype of RICO, an LLM-based chatbot that answers queries about a subset of the RIC’s design regulations. Built using a Retrieval-Augmented Generation (RAG) approach, RICO works in two stages: first it retrieves the most relevant sections of the RIC, then it uses the LLM to generate a clear, well-grounded answer.
For example, a user might ask, “What’s the minimum required platform width?” RICO would locate the relevant regulation in the RIC and then generate an accurate, readable response.
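To make the two-stage approach concrete, the sketch below shows a minimal RAG pipeline in Python. The RIC excerpts, the similarity scoring and the call_llm() stub are illustrative assumptions, not a description of Valcon’s actual implementation.

```python
# Minimal, illustrative RAG sketch: retrieve the most relevant RIC sections,
# then hand them to an LLM as context. The section texts, the scoring method
# and call_llm() are placeholders, not ProRail's or Valcon's real system.

import math
from collections import Counter

# Hypothetical RIC excerpts (invented for illustration, not real regulations).
RIC_SECTIONS = {
    "RIC 6.2.1": "The minimum platform width for a side platform is 3.5 metres.",
    "RIC 6.4.3": "Station elevators must offer step-free access to every platform.",
    "RIC 8.1.2": "Bicycle parking must be situated close to the station entrance.",
}


def similarity(query: str, text: str) -> float:
    """Crude bag-of-words cosine similarity, standing in for a vector search."""
    q, t = Counter(query.lower().split()), Counter(text.lower().split())
    dot = sum(q[w] * t[w] for w in q.keys() & t.keys())
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in t.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, k: int = 2) -> list[tuple[str, str]]:
    """Stage 1: find the k RIC sections most relevant to the query."""
    ranked = sorted(RIC_SECTIONS.items(), key=lambda item: similarity(query, item[1]), reverse=True)
    return ranked[:k]


def call_llm(prompt: str) -> str:
    """Stub standing in for whichever hosted LLM the real system would call."""
    return "(model output would appear here)\n--- prompt sent ---\n" + prompt


def answer(query: str) -> str:
    """Stage 2: ask the LLM to answer using only the retrieved sections."""
    context = "\n".join(f"[{ref}] {text}" for ref, text in retrieve(query))
    prompt = (
        "Answer the question using only the RIC excerpts below, citing the section numbers.\n"
        f"{context}\n\nQuestion: {query}"
    )
    return call_llm(prompt)


if __name__ == "__main__":
    print(answer("What's the minimum required platform width?"))
```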
Unlike general-purpose models like ChatGPT, RICO is designed to stick strictly to the information in the RIC and to decline questions outside its scope. This ensures responses stay relevant and grounded in the right documentation.
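A common way to enforce this kind of restriction, offered here as an assumption rather than a description of RICO’s internals, is to combine an explicit instruction in the prompt with a refusal whenever retrieval finds nothing sufficiently relevant:

```python
# Illustrative scope guard for a RAG chatbot; the prompt wording and the
# similarity threshold are assumptions, not RICO's actual configuration.

SYSTEM_PROMPT = (
    "You answer questions about ProRail's Rail Infra Catalogue (RIC) only. "
    "Use only the RIC excerpts provided as context and cite their section numbers. "
    "If the excerpts do not contain the answer, say so; never speculate."
)


def refuse_if_out_of_scope(best_similarity: float, threshold: float = 0.1) -> str | None:
    """Return a refusal before calling the LLM when retrieval finds nothing relevant."""
    if best_similarity < threshold:
        return "That question falls outside the Rail Infra Catalogue, so RICO cannot answer it."
    return None  # in scope: proceed with the normal retrieve-then-generate flow
```

Handling the refusal on the retrieval side, rather than relying on the model alone, makes the out-of-scope behaviour easier to test and to explain to end users.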
Rapid impact
At the end of the one-week sprint, Valcon showcased RICO to ProRail users, and the response was overwhelmingly positive. RICO delivered accurate answers to most questions, even handling complex or conditional information, and provided direct references to the source material so users could verify its responses. End users described the experience as a marked improvement over the old search interface.
Next steps
In the coming months, ProRail will use the insights gained during the validation phase to advance the chatbot’s proof of concept and prototype.
While the results were promising, Valcon also highlighted a few areas for improvement. The first is document management: improving the structure and quality of the source documents is likely to boost RICO’s accuracy even further.
Second, the probabilistic nature of LLM-based systems means that 100% accuracy can’t be guaranteed. Users must understand the importance of validating responses against the original documents, which underlines the need for user engagement during development and testing, as well as for user training and a well-thought-out change management strategy.