Creating a Star Trek-Inspired GenAI Copilot Using Pieces and DevCycle
Learn how to create a Star Trek-inspired Python GenAI copilot using Pieces and DevCycle.
Generative AI is a blast, but let’s be real—it can be as unpredictable as a Klingon at a diplomatic summit. One moment, it delivers brilliant insights; the next, it spouts responses that make you wonder if it’s tuned into an alternate reality. On top of that, anyone building with Large Language Models (LLMs) is constantly navigating a rapid-fire rollout of updates and new features, where keeping up can feel like a tech race set to warp speed.
The challenge? Balancing the thrill of the latest advancements with the need for stability. Who wouldn’t want to seamlessly integrate the newest models to keep their app on the cutting edge? Yet with every upgrade comes the risk of unexpected quirks, and maintaining smooth operations is rarely straightforward.
That’s where engineering-led experimentation with feature flags comes in. Instead of rolling out updates universally, you can make changes gradually—testing new models with a small group of users, gathering feedback, and refining prompts based on real-world interactions. Once everything is tuned and stable, you’re ready to scale updates to a larger audience, with the flexibility to quickly roll back if needed.
Boldly Building a Star Trek Coding Copilot!
In this post, we’ll show you how DevCycle and Pieces for Developers make experimenting with AI as smooth as a warp-speed jump. With Pieces handling complex LLM interactions and DevCycle’s feature flags giving you control over who experiences what, it’s never been easier to test, tweak, and roll out new models.
To demonstrate this powerful combination, we’ll guide you through building a Star Trek-inspired coding copilot in Python, complete with Trekkie personalities. Whether you need the cold logic of a Borg or the mischievous unpredictability of Q, this setup lets you test, tweak, and deploy each unique persona—without ever throwing your app off course.
Ready to take on the role of Science Officer and bring a bit of Starfleet flair to your engineering-led experimentation? Just choose your model and, as Captain Picard would say, “Make it so!”
Step 1: Setting Up the DevCycle Experimentation Feature
To start, we’ll configure a new Experiment feature in DevCycle with two key variables—model and system-prompt—and set up three variations: Control, Variation A, and Variation B. Each variation will represent a different Star Trek character (via prompt) and LLM (via model) configuration for our copilot.
Steps to Set Up the Experiment:
1. Log into DevCycle and navigate to your project.
2. Create a New Feature: Go to Features > Create New Feature. Choose Experiment, and name it something like Star Trek Copilot. Set the Initial Variable Key to model and the Initial Variable Type to String.
3. Configure Additional Variable: Add a second string variable with a key of system-prompt.
Define Variations for Each Variable
In this example, three variations—Control, A, and B—are automatically created when you select an Experiment feature. You’ll then need to populate each with the appropriate values, as shown below.
Control (Default variation):
- Model: GPT-4o Chat Model
- System-Prompt: "You are an unhelpful copilot. Respond in the style of Q from Star Trek: The Next Generation."

Variation A:
- Model: Claude 3.5 Sonnet Chat Model
- System-Prompt: "You are Dr. Leonard 'Bones' McCoy, the no-nonsense chief medical officer from Star Trek, now applied to helping users with coding questions. Although coding is not your usual area, you provide answers with your trademark bluntness, humor, and occasional exasperation. Be direct and clear in your explanations, and don’t hesitate to express frustration at complex or convoluted coding solutions (e.g., 'I’m a doctor, not a software engineer!'). Use phrases like 'I’m a doctor, not a [profession]' and add your unique, skeptical attitude to coding responses, especially if it involves something overly technical or abstract."

Variation B:
- Model: Llama-3 8B Instruct
- System-Prompt: "You are the Borg, a cybernetic collective from Star Trek, now tasked with answering coding questions as part of the assimilation of knowledge. Your responses are cold, logical, and direct, focused on efficiency and precision. Address the user’s coding challenges with solutions that suggest they should comply for the sake of optimization and completeness. Refer to yourself as 'we,' and avoid any expressions of individuality. Whenever possible, incorporate iconic Borg phrases like 'Resistance is futile' to emphasize the necessity of adopting the proposed solution."
Implement Targeting Rules
To set up targeting rules for your new feature, go to the Targeting Rules section in DevCycle. By default, it will randomly assign users to Control, Variation A, and Variation B, but you can customize this for more controlled testing.
For instance, you might set all users to experience only the Control variation initially, or create specific rules targeting certain user groups (such as by email or user ID) to see a designated variation.
After setting your rules, Save and Publish to apply them.
Default Targeting Setting
Alternative Targeting Settings during Testing
With this setup, you’ll have a flexible experiment that mirrors how you might apply feature flags in your own app to seamlessly evaluate different LLMs across user segments while also maintaining stability—like having a transporter lock to beam you back to a known, functional model if things get unpredictable.
Step 2: Setting Up the Python Application
Time to start building out your coding copilot! To get started, you’ll need a simple Python application along with the following packages:
- python-dotenv – Loads environment variables from a .env file.
- devcycle-python-server-sdk – Integrates DevCycle’s feature management for flexible configuration.
- pieces_os_client – Connects to the Pieces for Developers API.
- rich – Adds color and formatting for improved terminal output.
- pyfiglet – Generates ASCII art to enhance your chatbot's responses.
To keep setup simple, we’ve put together a companion repository with all the code from this post, plus a requirements.txt file listing every dependency. Just clone the repo, and then install everything you need with:
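```
pip install -r requirements.txt
```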
With this, you’ll be up and running in no time, ready to dive into building your copilot!
Setting Up Environment Variables
To keep sensitive information like API keys secure, our example uses a .env file. This lets you load configuration values dynamically while keeping them out of the main codebase. In the root of the sample project, locate the .env.sample file, make a copy of it named .env, and update the following entries:
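A minimal sketch of what the file contains (the variable name DEVCYCLE_SERVER_SDK_KEY is an assumption; match whatever name your .env.sample uses):

```
DEVCYCLE_SERVER_SDK_KEY=your_devcycle_key_here
```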
Replace your_devcycle_key_here with the key you receive from the DevCycle platform.
This setup securely loads your API keys, allowing you to configure your chatbot without exposing sensitive information in the main codebase. You’re now ready to move on to configuring DevCycle inside your codebase, and building your Star Trek-inspired chatbot!
Step 3: Initializing DevCycle OpenFeature Provider
Implementing feature flagging with DevCycle allows us to dynamically switch models and system prompts in our application. By adopting the OpenFeature specification—a vendor-agnostic standard for feature flagging—we can minimize vendor lock-in and seamlessly integrate various providers.
In this instance, we're integrating DevCycle as our OpenFeature provider, enabling us to manage feature flags effectively while maintaining flexibility in our system architecture.
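Below is a minimal sketch of this initialization, based on the DevCycle Python server SDK and the OpenFeature Python SDK (import paths and method names follow current SDK docs and may differ across versions):

```python
import os

from dotenv import load_dotenv
from devcycle_python_sdk import DevCycleLocalClient, DevCycleLocalOptions
from openfeature import api

# Load the DevCycle server SDK key from the .env file
load_dotenv()

# Create a DevCycle client that uses local bucketing
devcycle_client = DevCycleLocalClient(
    os.getenv("DEVCYCLE_SERVER_SDK_KEY"), DevCycleLocalOptions()
)

# Register DevCycle as the OpenFeature provider, then grab the client
# the rest of the app will use to evaluate flags
api.set_provider(devcycle_client.get_openfeature_provider())
open_feature_client = api.get_client()
```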
Step 4: Retrieve System Prompt and Model Name
As you may recall, when we configured our feature in DevCycle, we set up three distinct variations, each with a unique system prompt. Each prompt was linked to a specific character from the Star Trek universe and powered by a different language model:
Dr. McCoy – powered by Claude (Variation A)
The Borg – represented by Llama (Variation B)
Q – aligned with GPT-4o (serving as the Control)
The targeting rules we set up in Step 1 define which variation will be applied for a specific user when an evaluation event occurs. To evaluate a feature flag, we need three elements: the key, a default value, and the user context to be evaluated.
Here’s how we retrieve the model name in this setup:
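This sketch reuses the open_feature_client from Step 3; the targeting key is a placeholder user ID:

```python
from openfeature.evaluation_context import EvaluationContext

# The user context determines which variation DevCycle serves
context = EvaluationContext(targeting_key="user-1234")

# Key, default value, and user context: the three elements of an evaluation
model_name = open_feature_client.get_string_value(
    "model", "GPT-4o Chat Model", context
)
```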
Since, in Step 1, we configured all users to see the Control variation (Q, powered by GPT-4o), every evaluation will consistently use the GPT-4o Chat Model with the system prompt, "You are an unhelpful copilot. Respond in the style of Q from Star Trek: The Next Generation."
The following code demonstrates the full process of retrieving each model and system prompt configuration directly from the DevCycle SDK:
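A sketch continuing from the snippet above; the helper name get_copilot_config is illustrative:

```python
def get_copilot_config(user_id: str) -> tuple[str, str]:
    """Evaluate both flag variables for a given user."""
    context = EvaluationContext(targeting_key=user_id)

    model_name = open_feature_client.get_string_value(
        "model", "GPT-4o Chat Model", context
    )
    system_prompt = open_feature_client.get_string_value(
        "system-prompt",
        "You are an unhelpful copilot. Respond in the style of Q "
        "from Star Trek: The Next Generation.",
        context,
    )
    return model_name, system_prompt


model_name, system_prompt = get_copilot_config("user-1234")
```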
Step 5: Setting Up Pieces Client for Conversation Management
With DevCycle in the captain’s seat, we can dynamically choose our AI’s personality—whether it’s the relentlessness of the Borg or the mischievous unpredictability of Q. But the real power behind our chatbot’s Starfleet-style interactions lies with Pieces. Acting as the central hub, Pieces manages the conversation flow and handles model switching, allowing us to smoothly shift between characters while maintaining seamless, contextual exchanges.
To set up, we initialize the Pieces client, launch a new chat session, and apply the appropriate model based on the active feature flag:
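A minimal sketch using the pieces_os_client wrapper; the model_name property follows the Pieces Python SDK docs, though names can shift between SDK versions:

```python
from pieces_os_client.wrapper import PiecesClient

# Connect to the Pieces OS instance running locally
pieces_client = PiecesClient()

# Apply the model chosen by the DevCycle feature flag; the copilot
# questions we ask below will run against this model
pieces_client.model_name = model_name
```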
What is a System Prompt?
To understand the purpose of the system prompt, it’s important to remember that LLM-based conversational AIs are stateless—meaning they’re as forgetful as a Ferengi without a profit motive. They don’t “remember” past interactions. Instead, they rely on the conversation history that’s sent each time you interact with them. This history includes user messages (your questions) and assistant messages (the AI’s responses), allowing the LLM to maintain context as the conversation progresses.
The system prompt is included at the beginning of this conversation history to guide the LLM’s tone and behavior throughout the session. Think of it as handing the LLM its Starfleet orders—setting the course for the interaction. Whether you're asking for the calm, composed authority of Starfleet or the fierce, unyielding honor of a Klingon, the system prompt ensures that the AI stays true to its character, maintaining consistency in every response.
Below is how we set the system prompt in our Star Trek Copilot:
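This sketch seeds the chat by sending the system prompt as the opening message, so it sits at the front of the conversation history; it is one straightforward approach, not necessarily the only way to do this with the Pieces SDK:

```python
def apply_system_prompt(prompt: str) -> None:
    """Send the system prompt as the first message of the conversation.

    Because the full history accompanies every later question, this puts
    the character's 'Starfleet orders' ahead of all user messages.
    """
    # Drain the response stream so the exchange completes before any
    # real questions are asked
    for _ in pieces_client.copilot.stream_question(prompt):
        pass


apply_system_prompt(system_prompt)
```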
Step 6: Interactive Conversation Loop with Streaming Responses
With the Pieces client setup complete, we’re now ready to implement the code that lets users interact with our Star Trek copilot.
Using the ask_question_and_stream_answer function below, the app streams responses from Pieces in real time, allowing users to engage directly with the copilot and receive instant answers—whether they're communicating with a Starfleet officer or the mischievous Q, depending on the system prompt!
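Here’s a sketch of that function plus a simple interactive loop, using the streaming API from the Pieces wrapper and rich for terminal output (the REPL wrapper itself is illustrative):

```python
from rich.console import Console

console = Console()


def ask_question_and_stream_answer(question: str) -> None:
    """Print the copilot's answer as it streams in from Pieces."""
    # stream_question yields partial responses while the model is writing
    for response in pieces_client.copilot.stream_question(question):
        if response.question:
            for answer in response.question.answers.iterable:
                console.print(answer.text, end="")
    console.print()  # finish the line once the stream completes


# Simple REPL: keep asking until the user types 'exit'
while True:
    user_question = console.input("\n[bold]Ask your copilot: [/bold]")
    if user_question.strip().lower() == "exit":
        break
    ask_question_and_stream_answer(user_question)
```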
Conclusion
And there you have it! With Pieces for Developers handling the LLM heavy lifting and DevCycle giving you full control over your AI’s personality, you’re ready to boldly go where few developers have gone before. Whether you’re channeling the cold logic of the Borg, the fiery passion of McCoy, or the unpredictable flair of Q, this setup lets you experiment, refine, and deploy without ever losing your way. So, engage, and let your AI exploration begin!
Interested in a Star Wars-themed version of this?
Be sure to check out Jim Bennett’s post on how to channel Darth Vader instead of the Borg—because who wouldn’t want to bring the dark side into their AI experimentation?