The Ballad of CySat-1 - Tales from an Undergraduate CubeSat Team

Ultimate Steve · December 20, 2024

Hello, a few of you may be aware that I spent two years working on this little guy:

This is CySat-1, a 3U CubeSat designed to measure Earth's soil moisture using a software defined radiometer. The long end of that umbilical cable is currently hanging up on my bedroom wall, and everything else is currently orbiting the Earth as a nonfunctional and quite expensive brick (or maybe it has re-entered by now, or maybe a miracle has happened and they made contact but didn't tell me, but I'm going off what I last heard).

In addition to being a nonfunctional brick, it is also a little box of pain and suffering and joy and relief and learning. I have a lot of stories to tell about this thing which has repeatedly tested my technical ability, my sanity, my understanding of my own self, and the limits of my emotions, among other things. I might eventually write a book about it, but that's a ways off, and I figured since a few of you have expressed interest, that I would put all of those stories in one place. I'll update this thread with new stories as I remember them or as I find the time and energy to write them all out.

I hope to describe here a lot of the herculean (for undergraduates) technical challenges we had to overcome, and possibly some of the emotional challenges I had to overcome, in the pursuit of getting this thing up in the sky. If you've read my mission reports, expect a similar feel, but this will be my first time writing this way about events that actually happened, so it looks like this loaf of bread sized angry box of silicon and copper might still be requiring me to learn new skills haha.

While I will not use names, it is reasonably possible to put a name to an alias. I ask that you don't do that, and that you certainly don't contact anyone involved.

There are two "Overview" stories to provide the context necessary to understand the rest of the stories. If you have any questions, please let me know; talking about this thing is one of my favorite things to do.

Acronyms and Glossary

SDR - Software Defined Radio (in general) or Software Defined Radiometer (referring to the specific instrument on the satellite)
EPS - Electrical Power System
ADCS - Attitude Determination and Control System
UHF - Ultra High Frequency, usually referring to the satellite's UHF Transceiver
FPGA - Field Programmable Gate Array, sometimes referring to the satellite's SDR instrument
LNA - Low Noise Amplifier
OBC - On Board Computer
UART - Universal Asynchronous Receiver/Transmitter - A serial protocol for exchanging data
I2C - Inter-Integrated Circuit - Another serial protocol for exchanging data
GNU Radio - A software defined radio program
C&DH - Command and Data Handling
M2I or M:2:I - Make to Innovate

I will note that all of this is from my best recollection of events, some of which happened over two years ago. There's no way I'm gonna get everything 100% right. I will probably put a number of events out of order.

Story 1 - Project History and Hardware Overview (So you know what I'm talking about when I reference a particular part of the satellite)


CySat-1 is a 3U CubeSat. It is named after Iowa State University's mascot, Cy the Cardinal, and was developed, assembled, and tested over the span of many years, usually by a team as part of Make To Innovate (M:2:I or simply M2I). M2I is an aerospace department managed (but not aerospace major exclusive) project based learning course/class hybrid with several cool projects and teams within it. We have a few rovers, a few rockets, a few airplanes, a balloon team, and last I heard a friend of mine was trying to get a phased array radar team started as well. Lots of cool things. And, of course, CySat, though CySat was moved to be its own thing during the last semester of its development (there will be a story about this).

I don't know when exactly the modern version of the project started, but the University has been trying to launch a satellite in some form since the early 2000s and the modern version of the program started some time around 2015. There was some work on a 1U CubeSat prior to this time, which would have gone up there and beeped, but NASA rejected it as it had no scientific value. A significantly more complex (too complex for the team that was assembled as it would turn out) 3U version with attitude control and a scientific payload was designed, and that is the version (with significant changes) that ultimately launched into space on Cygnus NG-21 in August of 2024, and was deployed from the ISS a few weeks later.

To my knowledge, no communications from it were ever received. I have not had a stable channel of communication with the current team, so most of what I can do is speculate, and it may never be determined what the exact cause of failure was, but it was likely during the startup sequence. One possible cause of failure is that due to some bad power usage assumptions and possibly an unfortunate ejection angle, CySat-1 never maintained a positive rate of charge and died waiting to have enough power to deploy its antenna. Another theory is that it reached antenna deployment, but the battery voltage threshold was set too low and the current spike from attempting to deploy the antenna triggered undervoltage protections and sent the satellite into a boot loop. Another theory is just that the OBC locked up somehow. Yet another (likely incorrect) theory is that the 3D printed battery cover may be worse at retaining heat than the aluminum one it replaced, and the battery heaters had to work significantly harder than expected, so CySat used all of its battery up trying to keep its batteries warm.

There are also 99% disproven theories that the minuscule draw of the beacon caused the satellite's rate of charge to be negative, and that the new battery cells did not have heaters at all. These are highly unlikely, if not outright impossible, to have been the cause, but I thought I'd list them for the sake of being complete.
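To make the boot loop theory above a bit more concrete, here is a minimal Python sketch of the idea: a battery can look healthy at rest but sag below the undervoltage cutoff the moment a big load kicks in. Everything here is an assumption for illustration, including the simple internal-resistance battery model, the function names, and every number. None of it comes from the actual EPS settings or the flight code.

```python
def loaded_voltage(resting_v, load_current_a, internal_resistance_ohm):
    """Battery terminal voltage under load, using a simple V = V_rest - I*R model."""
    return resting_v - load_current_a * internal_resistance_ohm


def deployment_attempt(resting_v, deploy_threshold_v, cutoff_v,
                       deploy_current_a, internal_resistance_ohm):
    """Return what happens on one pass through a hypothetical startup check.

    "wait"     - battery is below the deploy threshold, so nothing is attempted
    "brownout" - deployment was attempted, but the current spike pulled the
                 battery below the undervoltage cutoff (the satellite resets)
    "deployed" - deployment was attempted and the battery held up
    """
    if resting_v < deploy_threshold_v:
        return "wait"
    under_load = loaded_voltage(resting_v, deploy_current_a, internal_resistance_ohm)
    if under_load < cutoff_v:
        return "brownout"
    return "deployed"


if __name__ == "__main__":
    # Hypothetical numbers only: 7.0 V resting battery, 6.5 V cutoff,
    # 2.0 A deployment spike, 0.3 ohm internal resistance.
    for threshold in (6.8, 7.4):
        result = deployment_attempt(7.0, threshold, 6.5, 2.0, 0.3)
        print(f"deploy threshold {threshold} V -> {result}")
```

With these made-up numbers, the lower threshold lets the satellite try to deploy at a voltage that cannot survive the spike. After the reset it would run the same check at roughly the same voltage, which is how a single bad threshold could turn into a loop. The higher threshold just keeps waiting for more charge instead.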

Given how many plausible causes there were, I have concluded that I put a lot of time into the main operating phase of the satellite and not enough time into the startup sequence. Note that I didn't say "too much time" in regards to the operating phase, that needed more time too, though more time wasn't really an option. Generally, I think the satellite could have benefitted greatly from another semester, but that's a story for another time.

CySat-1 consists of several major and minor subsystems, many of which use PC-104 connectors as the main electrical and communications bus.

Major Subsystems:

The OBC - The heart of the satellite. A small ARM Cortex based computer built by Endurosat. It runs C code that tells the rest of the subsystems what to do and when to do it.
The UHF Transceiver - Responsible for sending data to, and receiving commands from Earth. The most problematic of the subsystems in my opinion. Also built by Endurosat.
The UHF Antenna - The satellite's deployable antenna. Also built by Endurosat. It can only be deployed once and it was programmed in a way where talking to it is difficult - Please give me one button that says "Deploy", not a complex integer-within-string based command with an ambiguous response that has to be sent to a completely different subsystem over 2 separate communications protocols.
The EPS - Handles power distribution, toggling, and storage for most of the satellite. Also built by Endurosat. It has battery cells, battery heaters, six general purpose outputs, five power buses, six inputs for solar panels, three button switches, a remove before flight pin, and probably a bunch more features I forgot.
The ADCS - Responsible for determining and controlling the satellite's attitude. Built by CubeSpace, it has a lot of sensors and devices, including a GPS receiver, a magnetometer, star trackers, sun and Earth sensors, a reaction wheel, and three magnetorquers.
The Payload - The Software Defined Radiometer designed to measure Earth's soil moisture. The professor in charge of the program CySat is part of constructed a software defined radiometer for laboratory use for his master's thesis. The idea, as far as I understand a master's level electrical engineering project (which is not very much) is that computers are now fast enough to do digital signal processing on radio signals, and doing signal processing using software has several significant advantages over using traditional analog components. This thesis explored the viability of using Software Defined Radio technology to create a radiometer. When CySat-1 was upgraded from 1U to 3U, the decision was made to put a miniaturized version of this instrument on board. I'm no electrical engineer, but the radiometer listens in the protected band of 1400-1427 MHz (corresponds to radiation emitted by hydrogen, nobody is allowed to transmit here as it is an important band to astronomy) and eventually spits out a number with the units of Kelvin corresponding to how much radiation in that band is being observed. The exact math as to how it happens is a little beyond me. This instrument consists of four pieces:
- The antenna (not much to say here, it is the one thing on the whole satellite that never experienced a single change over the two years I was there as far as I am aware) (in the picture it is the gold and black circles near the top, it is pretty obvious)
- The RF chain - There were many versions of this but the final version consists of a bunch of SMA components from Mini Circuits connecting three Low Noise Amplifiers to each other, the antenna, and the FPGA. Originally designed by me but very little of my efforts made it into the final design as it was revised into nothingness by people more knowledgeable than me.
- The FPGA/The SDR - An Analog Devices RF focused board that has, I believe, a CPU, an FPGA, and several analog to digital converters, that comes packaged with its own custom version of Linux tailor made for working with software defined radio. It runs two CySat relevant programs, the GNU Radio flowgraph that captures the scientific data, and a Python script that deals with handling the transfer of commands and data between the SDR and the OBC.
- The Carrier Board - There are at least four carrier boards in the story of CySat, but each one connects to the FPGA and provides it with I/O interfaces for data and power. There's one really large one with HDMI and several USB ports used for development and testing, there's one really small one with only a UART and power connectors that ultimately didn't work, there's one medium sized off the shelf board that was cut in half in an effort to make it fit but it didn't work for obvious reasons, and there's another medium sized off the shelf board that ultimately made it on to the satellite after substantial modifications to both the board and the rest of the satellite, but that's at least two or three stories on its own.
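For the curious, the reason a radiometer can report its answer in Kelvin comes down to one textbook relation: the noise power P picked up in a bandwidth B is P = k * T * B, where k is the Boltzmann constant. The Python sketch below just applies that relation in both directions. It is not the CySat-1 processing chain, which I can't reproduce here and which would also have to deal with calibration, amplifier gain, and the receiver's own noise. Using the full 1400-1427 MHz band as the bandwidth is my assumption, and the function names are made up for this example.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23   # Boltzmann constant (exact SI value)
BAND_LOW_HZ = 1400e6               # lower edge of the protected band
BAND_HIGH_HZ = 1427e6              # upper edge of the protected band
BANDWIDTH_HZ = BAND_HIGH_HZ - BAND_LOW_HZ  # assumed bandwidth: the whole band


def noise_power_watts(temperature_k, bandwidth_hz=BANDWIDTH_HZ):
    """Thermal noise power P = k * T * B for a given temperature."""
    return BOLTZMANN_J_PER_K * temperature_k * bandwidth_hz


def equivalent_temperature_k(power_w, bandwidth_hz=BANDWIDTH_HZ):
    """Invert P = k * T * B to turn a measured power into a temperature."""
    return power_w / (BOLTZMANN_J_PER_K * bandwidth_hz)


def watts_to_dbm(power_w):
    """Convert watts to dBm (decibels relative to one milliwatt)."""
    return 10.0 * math.log10(power_w / 1e-3)


if __name__ == "__main__":
    # 290 K is a common reference temperature, used here only as an example.
    power = noise_power_watts(290.0)
    print(f"power in band: {power:.3e} W ({watts_to_dbm(power):.1f} dBm)")
    print(f"back to temperature: {equivalent_temperature_k(power):.1f} K")
```

The takeaway is how tiny the power involved is, which is why the RF chain needs three Low Noise Amplifiers before the signal ever reaches the FPGA.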

Other Subsystems:

The Boost Board - A bit of a misnomer. Sometimes also known as the Electronics Board. A square of prototype board that was originally created to house the boost board (a story for another time) but ended up playing host to at least five or six separate components and connections we found out we needed throughout development. Built by a variety of manufacturers, to an extent partially designed by me.
The solar panels - Lots of stories about these but the ones that ended up on the satellite are static solar panels built by Pumpkin Space. Not a lot to say there. They are in weird sizes and locations but that's its own story.
The Structures - There are three major metal brackets holding up various components, connected on the corners by four corner brackets, and there are four threaded rods running the length of the satellite, one per corner, to which the three major brackets and a few other minor brackets are affixed. There are also many spacers involved. There are probably some other parts I'm forgetting as I was not really involved with the structures, the most I did was failing at finding a creative way to get one nut to attach properly.

Ground Systems:

CloneComm front end - A Python GUI and outgoing packet encoder that is used to command the satellite and, to an extent, display received data from the satellite (not all functionality was completed in time but the important stuff is accessible in one way or another). Written nearly entirely by yours truly.
CloneComm back end - A GNU Radio flowgraph that handles the decoding of incoming packets from the ground SDR and the modulation of outgoing packets. The back end and front end talk to each other over the internet using ZeroMQ, a black box TCP library. I made it myself but the downlink half uses a prebuilt solution so I only had to go through a little more than half of the effort it would have taken to actually write the entire thing.
The ground SDR - Talks to the CloneComm back end over ethernet and also to the ground antenna.
The ground station antenna - The one thing not done by the time I left the team. Presumably other software is needed to tell the antenna where to point, but I never got the chance to tackle that problem.

The only things in this list that I didn't have to solve problems with are the SDR antenna, the ground station antenna (because it didn't exist), and the structures. Everything else I solved at least one problem with. CySat has given me a lot of invaluable experience with trying to get 17 bajillion computers from a gazillion different manufacturers to talk to each other nicely, among other things. That's usually how I describe it in one sentence to someone who doesn't know much about what I did.

Story 2 - An Overview Of My Time On CySat-1

This generally goes through a summary of events as they happened - Individual events can have stories behind them as long as this entire section, this is just an overview to show the general timeline and how events may have fit together. This is significantly longer than I hoped it would be.


Before Me

Before my time, to my knowledge, a competent and informed team worked on CySat-1 until early 2020, when COVID forced a temporary shutdown of the project. Due to the abrupt cutoff (and this is a little bit of speculation on my behalf) there was not adequate communication of what was and wasn't done (or in some cases, there was, but it was buried in the not very organized documentation folder and was not discovered until much later). The satellite sat, disassembled, in a small room in the back of the M2I lab, gathering dust, for many months. CySat was revived in, I believe, 2021, and from my knowledge, the plan was that the professor in charge of the project would hire a few grad students to finish CySat-1, and the main M2I team would start work on CySat-2's primary instrument, a more manageable radio receiver meant to collect data from an under-explored part of the radio spectrum. Throughout, I believe, Fall 2021 and Spring 2022, they completed a prototype of the instrument and flew it on board a high altitude balloon, but it collected no data as the positive and negative terminals of one of the components had been wired backwards. This was not caught by anyone on the team or the professor in charge so it isn't a knock on anyone just in case you thought that. I've certainly made multiple similar mistakes.

For some reason or another, the grad students meant to finish CySat-1 never got hired and CySat-1 still lay in the corner in pieces collecting dust.

Fall of 2022. My friend, who worked on CySat-2's instrument, had me sub in for her as "mission control" for the balloon flight, as the flight was many hours long and she had a class. The job, if I remember correctly, consisted entirely of making sure one number didn't exceed boundaries, which it didn't, because the instrument didn't ever properly turn on unfortunately.

While we were there, she talked to me about joining CySat. I objected as I felt that I was in no way competent enough to work on an actual satellite. This was partially because I had some impostor syndrome at the time, and partially because I genuinely had no clue where to start. My entire M2I experience thus far had been one semester on the least prestigious project (A KSP Simpit) where I cut one piece of wood, resolved one minor control issue in Python, and failed at diagnosing an Arduino wiring issue. And I was only just starting my junior year after a very difficult time adjusting to the workload, lifestyle, and emotional requirements of college.

But she convinced me to join anyway, and if you're reading this, thank you, I loved it so much and hated it so much on alternating seconds, and I learned a lot thanks to your invitation.

So, I signed up for M2I, and in Fall 2022, I joined CySat. In the first meeting, it was explained that about half of the team would work on CySat-2 and half of the team would finish up CySat-1, which was scheduled to be handed over for launch at the end of the semester. I was intending to do CAD work on CySat-2 as it felt like the only thing I would actually be competent at, but then the person running the meeting asked me if I would be willing to finish CySat-1's software first. I was told that "Oh it has been sitting in a corner almost completed, it just has to be put together and a few small software tasks need to be done." I said I'd try, though I felt that was very outside my comfort zone. This was because my programming experience at that point had been limited to some JavaScript on Khan Academy, a decent amount of MATLAB, a tiny amount of C++ for a basic Arduino project or two, and failing at learning Python because the introductory Python class was so poorly managed (it was a lot of "Copy paste this and don't ask questions").

And when they said "Oh yeah it just needs one or two things finished?" Yeah. Well. I think you can guess what the reality was!

The Early Days

This is meant to be a short summary and I'm failing at that, so I'll talk more about this later and I'll try to keep it short here.

Have you ever put a group of undergraduates (I don't even think any of us were seniors, and there were probably as many underclassmen as juniors) with no satellite experience in a room with a pile of satellite parts, given them a GitHub link and a documentation link, and told them to make it work, with no contact with the previous teams and minimal assistance from people that knew what they were doing?

Well, you probably haven't, but the result was about what you would expect. We didn't really know what we were doing. At the time I was one of the programmers on the programming team, and we were tasked with finishing CySat-1's flight software, which was written in C. Only one person on the team had any significant experience in C whatsoever. At best, a few of us had done a few lines of C++ on a tiny Arduino project before. And what's more (I can't speak for the others, but I'd imagine the experience was similar), I was so terrified I'd break the expensive complex satellite that it took me a long time to get over the fear of touching it and messing something up.

It took us like half of that semester to figure out how to turn the OBC on.

When we did, we finally were able to start working our way through trying to understand the flight code and what still needed to be done, and while I had initially been assigned to the EPS, I turned my focus largely to the UHF transceiver, partially because nobody had been able to make it work before, and partially because the handover-for-launch deadline was coming up and we had not figured out how to get the satellite to do a single thing besides flash its lights at us. If I could just get the satellite to deploy its antenna and turn on its beacon, that would put us ahead of like 50% of university cubesats. It is a bit of an old statistic, so it might be outdated, but the statistic that everyone quoted was "Half of university cubesats are launched into space and never heard from again."

Each passing week we kept encountering problems that we didn't know how to solve that would stall progress, and before I knew it, the end of the semester was upon us. I had made some progress with the UHF, I was able to use the OBC to tell it to do some basic commands, but I was not able to get it to send data home or receive data. I had not made any progress with the antenna. My goal at that point was just to make the beacon (simple preset radio message that loops once in a while) turn on after the satellite was deployed. That was it. The antenna likely wouldn't even deploy as I couldn't figure that out.

The deadline for vibrational testing (after which no hardware changes could be made) was upon us. After this time, the various USB ports used to debug the various subsystems would be inaccessible, though we concocted a plan that would at least leave the programming umbilical accessible, and software development would continue after vibe test, as futile as it would be.

The entire structures team labored deep into the morning in an attempt to fully assemble the satellite. I wished them luck and went to bed.

I woke up to a defeated team, as they had been unable to get the satellite to fit together. It has been a while so I don't remember the details of why. We had missed vibrational testing, and as a result, we were going to miss our handover date, and it was looking like this was the end of the over 5 years of effort towards getting CySat-1 into space.

The Second Chance

But then someone at NASA took pity on us and delayed CySat-1 to another flight instead of demanifesting us, which I had been told was very unlikely to happen (I believe the project had already been significantly delayed by that point but I'm not sure about this). All of a sudden, we had another shot, and we had a semester and a half to do it.

The leadership was reorganized - CySat-2 was placed on the backburner, all cylinders would be firing on CySat-1. In the shuffle I was promoted to lead of the testing and integration subteam (which was basically just software testing and software integration, it is kind of awkward when interviewers see the title and ask me about all of the hardware tests I performed when I have to stretch to even come up with a single simple hardware test), which I did not at all feel qualified to do, as I had zero leadership experience. I had only managed to get the satellite to beep at us intermittently, but that was more than anyone else had been able to do. So I gave it an attempt.

During that second semester, the first priority was to figure out what actually needed to be done. This ended up being largely, among other things:

Test the software links between each subsystem to see if they worked, and add each command to the big case switch statement so it could be activated from the ground
Pretty much all of the communications software (nobody had yet managed to establish even one way communications with CySat-1 outside of beacon testing)
Communicate with the payload

And on the structures team, the tasks involved redesigning the solar panels (there are like three stories here) and to an extent the main structures of the satellite, as well as assembling the SDR's carrier board.
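For anyone wondering what "add each command to the big case switch statement" from the software list above looks like in practice, here is a rough Python analog. The real flight software is C and uses an actual switch statement. This sketch uses a lookup table instead, which is the usual Python way to get the same effect. The subsystem IDs, command IDs, handler names, and replies are all made up for illustration and do not match the real CySat-1 command set.

```python
from typing import Callable, Dict, Tuple

# A handler takes the raw command payload and returns a raw reply.
Handler = Callable[[bytes], bytes]

# Hypothetical subsystem IDs, invented for this example.
SUBSYSTEM_EPS = 1
SUBSYSTEM_UHF = 2


def eps_get_battery_voltage(payload: bytes) -> bytes:
    """Stand-in for asking the EPS for its battery voltage."""
    return b"EPS:VBATT"


def uhf_set_beacon(payload: bytes) -> bytes:
    """Stand-in for turning the UHF beacon on (payload b'1') or off."""
    return b"UHF:BEACON_ON" if payload == b"\x01" else b"UHF:BEACON_OFF"


# Each (subsystem, command) pair plays the role of one case in the switch.
COMMAND_TABLE: Dict[Tuple[int, int], Handler] = {
    (SUBSYSTEM_EPS, 1): eps_get_battery_voltage,
    (SUBSYSTEM_UHF, 1): uhf_set_beacon,
}


def dispatch(subsystem_id: int, command_id: int, payload: bytes = b"") -> bytes:
    """Route one ground command to its handler, like the switch's default case
    when nothing matches."""
    handler = COMMAND_TABLE.get((subsystem_id, command_id))
    if handler is None:
        return b"ERR:UNKNOWN_COMMAND"
    return handler(payload)


if __name__ == "__main__":
    print(dispatch(SUBSYSTEM_EPS, 1))
    print(dispatch(SUBSYSTEM_UHF, 1, b"\x01"))
    print(dispatch(9, 9))
```

The point of the task was the same either way: a command that works when called directly from test code is useless in orbit until it also has an entry here, because this is the only door the ground station can knock on.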

I had three people who were new to the project and one person who had been there last semester on the testing and integration (software) team, so I assigned everyone to a subsystem. Each of the three sophomores would focus on testing and refining the parts of the command and data handling software for the EPS, ADCS, and UHF respectively. This was meant to be a quick introductory task to get them familiar with a certain subsystem and in the right mindset to try things, get comfortable working with the satellite, and also make forward progress so I could spend my time on more complex problems. However it ended up dragging on as they ran into many of the same pitfalls I did during my first semester. Most of the command testing did end up getting done, but they got stuck a lot and often would not seek help until the weekly meetings. And then I'd help, and sometimes the problem would be something I didn't adequately communicate or they didn't remember, and sometimes it would be a huge Frankenstein's monster of a problem that the entire team had trouble solving.

All in all, my first attempt at leadership was a bit of a flop. I would like to sincerely apologize to those three. I'd like to think that I provided a better environment for them than the environment I had joining the team, but I had not yet learned how to help someone through a complex task without taking the reins completely, and that really showed.

I assigned the other senior member on the team to the SDR, but we soon found out that the hardware to allow communications was still in development, which was in the hands of the professor. He pivoted to helping with the UHF instead.

I assigned myself the task of trying to crack UHF communications as I had become a little familiar with the UHF and it very much appeared to be the most critical remaining task. And there was still that mindset of "If nothing else works, we should have communications working so we don't fall into the 50% that don't phone home at all."

The long, LONG story of communications is for another time, but in short, it was very difficult and previous teams had been barking up the wrong tree. We had been led to believe that talking to this UHF was possible with standard amateur radio hardware, but in the end we had to pivot to using software defined radio due to the (extremely poorly documented, to the point of being misleading) packet protocol used by the UHF. At the very end of the semester I did manage to get the downlink partially working, though uplink was looking to be significantly more complicated. I was very proud of the demo I had for that year's expo (every team shows off their progress in one big room at the end of the semester), it was just the satellite transmitting the UHF's temperature on a loop and it showing up on a computer monitor, failing and having to be reset every 10-15 minutes. But that was more progress than anyone else had made on comms in the past six years, so I'll take it.

That semester, while nearly infinitely better than the previous semester, did not see a lot of forward progress, as none of those three bullet points were solved (not to mention the numerous smaller tasks like the antenna deployment that hadn't been resolved either) but we, as a team, began to make forward progress and grasp the true scope of the problem.

In general I felt like instead of being able to focus on the biggest problem and have the less experienced people work on the smaller problems, I spent a third of my time writing tasks, a third of my time helping other people with their tasks, and most of the remaining time fixing other things, so I was not able to put as much time into the hard stuff as I wanted to.

Summer

Due to the lack of progress, and the looming handoff date halfway through the Fall semester, the professor in charge offered to hire some of us to stay over the summer, with the promise of time dedicated to solving those hard problems without all of those weekly reports and stuff. Me and one of the members of the structures team took him up on that offer. Unfortunately, getting things set up took significantly longer than expected - the payment system for one. My other part time job doesn't offer enough summer hours to break even with housing prices, so I near about broke down in frustration one day over the possibility of paying money to not get to work on the satellite instead of getting paid money to work on the satellite. We had also moved to a clean room deep in the basement of the aerospace building (previously we had been in a normal lab) and getting us keycard access to that room (and training) (and purchasing new clean room gear) took a long time, so it was like a month before we could really get started.

A disclaimer here - the clean room was more like a somewhat-cleaner-room. The cleanliness standards even in the clean room would likely not be acceptable for any actual professional satellite. We were aware of that and we were doing the best with what we had, and it was still a significant step up from doing things in the lab, where things were frequently covered in a fine layer of sawdust.

(In fact, there's a lot of things like that on this project. So many things I'd never do in a professional setting, but the circumstances left us with no other obvious options.)

In addition, the clean room ran off of a strict buddy system - we could only go in to work on the satellite when both of us were available and there were strict times when we had to leave (I would like to apologize to my colleague for frequently pushing up against her deadlines). As we had differing schedules and vacations, this was a problem, as there were long stretches when no work could be done. Well, technically the lab also ran on a buddy system, but there were so many people in there that it was almost never an issue. And as the clean room gear was disposable (well, it was until we started to run out) and ordering more was a little bit of a pain, there was additional pressure to make each trip into the clean room count.

Initially I thought of this as a major hindrance, but it taught me a lot about time management and how much I really needed physical access to the satellite for. I came up with a system. I'd spend half an hour before going into the clean room making a list of all the things we needed to test. We would test each thing, and when we ran into a problem, we'd spend a small amount of time trying to troubleshoot it. If that went nowhere, we'd gather the data on it, move on to the next thing, and make a list of things to look into for our out of the lab time. Then until the next clean room time, I'd work on fixing the software to test in whatever location. Usually in one of the vacant bedrooms in my dorm/apartment, or in the library, or in the lobby of the Aerospace Engineering building. Wash, rinse, repeat. This did end up with a lot of the smaller tasks getting done when they otherwise wouldn't have, and allowed me to think more deeply about any given problem without as much time pressure.

That summer, we accomplished three major things, each of which deserves its own story. We established reliable two way communications with the UHF (this was a herculean task and the first sign that we had started to solve it was my happiest moment on the entire project and a solid candidate for a top 10 happiest moment of my life). We solved a long standing issue with the ADCS intermittently deciding to not work (the root cause was extremely stupid but the journey it took to get there was a really long roller coaster). And we started seriously trying to get the payload online. I wouldn't say there were any major breakthroughs in that department, but we got the ball rolling and started to understand the scope of the problem better. This is in addition to a lot of smaller things and problems within those major problems.

But I felt like I had actually accomplished something.

Oops it all fell apart again

The third semester, Fall 2023. The handover deadline was halfway through the semester. Unfortunately, the three software juniors had not returned for another semester (no doubt in part due to my poor leadership performance during the previous semester), though there was no shortage of manpower, as this semester I had a new junior that was able to hit the ground running, my senior was still there, and the structures team initially didn't have much to do, so many of them were transferred over to software temporarily (though I might be confusing this with the previous semester).

With communications solved, the focus shifted away from "Make it work once" and to "Make it work reliably and repeatably." The antenna still had to be solved, and the SDR still had to be solved, but the SDR was in the professor's hands, and he knows what he's doing, though it was getting awfully close to the deadline to figure out how to get the SDR to communicate with the OBC.

So the first bit of that semester was spent trying to use a terminal program as an intermediate, talking to the SDR as if we were the OBC, but our efforts were soon hampered by the SDR's bootable SD card maybe but maybe not deciding to corrupt itself. The three of us spent a lot of time trying to rebuild it, only to discover that we had a backup hidden deep within the documentation folders with an unassuming name.

Well, this deserves its own story, but the professor had been working on a carrier board for the SDR small enough to fit in the satellite. It was finally handed over to us for final assembly and testing in what felt like the eleventh hour. If it didn't work straight out of the box, we had no instrument.

I wasn't there for this, but we plugged it in, and it drew several times more power than we thought it would and it didn't turn on.

There was a massive short somewhere and we didn't have time to figure out where, and we certainly didn't have enough time to build a new one, so it was decided that we'd send CySat-1 up with a non-functional instrument and attempt to demonstrate communications and attitude control instead.

Thus I pivoted to ensuring that the communications systems were robust. The backup mission was that we would load the OBC's SD card with memes and attempt to downlink them. Doing this during ground testing helped me really ensure that the satellite's communications systems were robust. Much of the development of the ground station software happened during this semester as well. Attitude control, we hadn't taken a super close look at as it looked largely complete and we were hoping that the previous teams had properly done their homework.

During this time, I believe that the structures team was largely working on a new solar panel layout, but that's another story.

I do not remember what the outstanding issues were, but we once again had to enact the plan of continuing software development after vibe test.

Final assembly of the satellite, well, that's its own story, but suffice to say, through the herculean effort of the structures team, CySat-1 was dragged, kicking and screaming, into being fully assembled, with unfinished software, a nonfunctional instrument, an antenna that we weren't sure would deploy, solar panels bulging out due to lack of cable management channels cut in the brackets, and one solar panel that may or may not have been epoxied in place mere hours before the satellite inspectors showed up due to misaligned screw holes or a lack of a way to hold the nut in place to screw into. But hey. At this point we were still the underdogs. Getting the satellite to this point was a victory in itself.

The satellite had to be disassembled the night before handover because the remove before flight pin - like the most important component on the satellite as it protects the ISS - had been wired up backwards. But that is another story.

Now, why exactly the satellite didn't get accepted for launch is a story with so many painful-in-the-present but hilarious in hindsight beats that I'll save that story for later, but suffice to say, the satellite did not make it to space due to an issue with its structure. To my knowledge, the epoxied solar panel, the bulging, and the slightly dangling solar panels (due to our inability to get every screw in) were not actually showstoppers. Though they may have been, it is just that there was one extremely large, and in retrospect, kind of funny showstopper.

Back to the Drawing Board

NASA, somehow, decided to give CySat-1 yet another extension, and we were told in no uncertain terms that this would be the last one for real this time (though we had been told that the previous few times, or at least I had, what the higher ups had been told may have been different).

The structures team redesigned nearly the entire structure from scratch, attempting to make the satellite easier to assemble. At the same time, the entire satellite was redesigned in order to just barely fit the smallest off the shelf carrier board that was made for our SDR. This was something that had previously been looked into but had been deemed not feasible, if that's a measure of how tight of a fit it was.

Originally I wanted to use the last month or two to perform long duration tests of satellite operations that would have incorporated artificial day and night cycles, software longevity tests, on purpose trying to break the software to find its weaknesses, and more focus on the satellite's startup sequence, but the radiometer ate all of my time.

It fell to me to figure out how to deliver power and data to the new carrier board. This is like 3 or 4 stories worth of content. It was the second hardest problem I've had to solve next to getting two way communications to work.

But suffice to say we got power working, communications proved very difficult, and with like 2 weeks of notice, surprise! We're doing a test flight of the new radiometer design on a high altitude balloon as a collaboration with the high altitude balloon team and another class!

Originally we wanted to command the radiometer using the balloon's telemetry system but that proved too complicated to do under such short notice. The balloon only had an I2C with a connector that couldn't easily be changed, and a USB C, and the SDR had a UART over USB mini A as well as a normal USB, and while we tried, apparently USB actually has 2 ends (host and device) and we had 2 of the same so we couldn't easily adapt them, though we tried special adapters. So we just set the Radiometer to turn itself on upon startup and keep recording until its batteries drained.

This turned out to be a good thing, as the balloon's communications system failed, and all three redundant tracking devices failed (worthy of its own story), so we wouldn't have been able to command it anyway. The flight hardware got lost in a field over Christmas break and was eventually turned in by some farmers.

The Final Stretch

The final semester, CySat was downsized from a fully fledged M2I project into just me and the project manager. This was done without telling me, and thus I signed up for the wrong class, and that was a bit of a pain to resolve.

I understand the logic behind this decision - The two people that know the most are now able to go full force with no pesky weekly reports or task delegations and task assistance to hold us back. But it is a decision I do not agree with.

Firstly (and comparatively minorly), the satellite was now in the hands of two people and mistakes made by one person would only have 1 chance, if that, to be caught.

Secondly, this was essentially, in my mind, locking the satellite into that launch date. Not that NASA was particularly likely to give us yet another extension, but in the event that the satellite was not ready, we go back to square one, a bunch of undergraduates with no satellite experience and no mentorship from previous teams to help them along (we were both graduating seniors). As it is, not having team continuity for setting up the ground station and operating the satellite was bad enough, but this move threatened to undo all of the efforts we had made to cultivate institutional knowledge and team continuity. That was finally actually starting to take shape. The team was finally starting to fire on all cylinders that third semester, and this move would ensure that I was no longer able to rely on them for help.

However, the decision had been made. It was now or never. Me and the code and nothing in my way but technical challenges and my own weaknesses. One semester until handoff, and whatever we ended up with, we had to ship, as CySat would not be getting another chance from NASA, and even if it somehow did, finishing the satellite would fall to a fresh, discontinuous group, similar to the situation that started this whole mess in the first place, which would likely end the same way it had the first time.

Nearly a decade of development, thousands of manhours, hundreds of thousands of dollars, and getting it all across the finish line was on two sets of shoulders. No pressure or anything.

The first thing we did was ensure that our instrument survived 2 weeks of Iowa winter. It did, it worked fine, so we pulled our data, and... Complete junk.

This took a while and is enough for its own story, but it turned out to be because of a redundant bandpass filter that was way noisier than we thought it was (either through physical damage or simple miscalculation) so we simply removed it and it started working again. It wasn't caught in calibration because sometimes it behaved in a way that it was supposed to and sometimes it did not, and we were not able to figure out what was up and convinced ourselves it was probably fine. But again, that's its own story.

The bulk of that semester was integrating the SDR into the satellite. Power, as it turns out, was a huge issue, as we hadn't actually gotten accurate numbers for the Radiometer's power draw until recently, and it turns out it was right up against the current limit of what the EPS 5V bus could provide, and also, that much current would drop the voltage so low it would sometimes trip the undervoltage protections on various components. This led to a particularly nasty failure mode where the OBC would shut itself off to protect itself, but the SDR is unable to turn itself off (at least without significant hardware modifications which we did not really have the time for) so the satellite would just enter a death spiral, unable to shut the power hungry SDR off.

In the end the power issues were solved with extensive software checks. The communications issues, though, this section is long enough and that's easily its own story. They were resolved with weeks of significant effort, though it ended up being in perhaps the dumbest way possible, but spoilers.
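To give a rough idea of what a software power check of this kind can look like, here is a minimal Python sketch. It is not the flight code (the post doesn't describe the real checks in detail), and every name and threshold in it is a hypothetical placeholder. The idea is simply to refuse to switch on the power hungry SDR unless the battery and the 5V bus have enough headroom, so the bus never sags far enough to trip the undervoltage protections.

```python
from dataclasses import dataclass
from typing import Tuple

# All values below are hypothetical placeholders, not CySat-1's real numbers.
MIN_BATTERY_V = 7.6         # assumed minimum battery voltage to attempt a measurement
MIN_BUS_V = 4.8             # assumed minimum 5V bus voltage before adding load
BUS_CURRENT_LIMIT_A = 3.0   # assumed current limit of the 5V bus
SDR_EXPECTED_DRAW_A = 2.5   # assumed worst-case SDR current draw


@dataclass
class PowerReading:
    battery_v: float  # battery voltage in volts
    bus_5v_v: float   # measured 5V bus voltage in volts
    bus_5v_a: float   # current already being drawn from the 5V bus in amps


def safe_to_power_sdr(reading: PowerReading) -> Tuple[bool, str]:
    """Return (ok, reason). Only switch the SDR on when ok is True."""
    if reading.battery_v < MIN_BATTERY_V:
        return False, "battery voltage too low"
    if reading.bus_5v_v < MIN_BUS_V:
        return False, "5V bus already sagging"
    if reading.bus_5v_a + SDR_EXPECTED_DRAW_A > BUS_CURRENT_LIMIT_A:
        return False, "SDR would exceed the 5V bus current limit"
    return True, "ok"


if __name__ == "__main__":
    print(safe_to_power_sdr(PowerReading(8.0, 5.0, 0.3)))
    print(safe_to_power_sdr(PowerReading(7.0, 5.0, 0.3)))
```

The point of checking before turning the SDR on, rather than reacting afterwards, is that once the OBC has browned out there is nothing left running to switch the SDR back off.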

There was one day in particular where, for perhaps the first time on the project, I finally had nothing but good news to report to the professor during one of our meetings. I was finally optimistic about our chances. Half an hour after the meeting I was back in his office telling him I accidentally ran six amps of current through the OBC and that at least one mission critical system was damaged.

I completely recovered from that, bypassing the damaged system in less than 48 hours. That's a story for another time, but you know that scene in the Lego movie where Emmet finally realizes he is truly a master builder and goes "I CAN SEE EVERYTHING!" and builds things he was never able to build before? That's what that 48 hours felt like. I felt powerful. I had messed things up yet again. But I now knew the satellite in and out so much that I was able to bypass the issue and write an entirely new file handling system overnight. For basically the first time on the project, I felt like I was worthy of being there.

But then it was time for final assembly for vibe test (again). Once again, software development was not complete so we had to do the "last minute software development with an umbilical cable that will be cut off later" thing for a third time. Aside from one issue which technically resolved itself, the satellite was assembled perfectly after so much effort from the project manager, it passed vibe test, and it was cleared for handover.

During those last few weeks, I finally figured out how the antenna deployer is actually supposed to be commanded, figured out that the ADCS loop was never working because of a few misspelled variable names, fixed a few C&DH bugs, and was forced into integrated solar panel testing (the satellite was incapable of being charged via USB with the solar panels attached).

By this time, I had been able to, for weeks, if not months, successfully command the satellite to do things remotely. I'd press a button on the ground station software GUI (which I'd made) (If you're reading this, I'm sorry, I had to restart GUI development as Tkinter turned out to not be compatible with the multithreading/multitasking libraries necessary for simultaneous reception and transmission, so all those buttons you made in Tkinter sadly went to waste) and it would get encoded and sent over the internet to the ground station backend (which I'd made) and it would get sent over the air to the satellite's interrupt based command and data handling system (which I'd fixed) and then it would do its thing. Sometimes this would be executing long strings of health checks (which I'd made half of) of the various subsystems, sometimes it would be telling the SDR to take a scientific measurement. I wrote the control code on the OBC side and finished and tested the control code on the SDR side. The OBC conducts several power checks, turns on the 2 parts of the SDR, makes sure they are on, and tells it to start recording. It does its thing. The OBC asks for the data, the SDR transfers the data to the OBC, which then turns off the SDR. The ADCS would concurrently be doing its own thing in a separate RTOS task and the two tasks would not interfere. The ground station then commands the OBC to return the data, and the satellite will break the data into packets and scramble them to ensure integrity (a system which I designed) and send them down to Earth, where the ground station will reassemble those packets into the original measurement and tell you which packets are missing and may need to be retransmitted (I did about a third to half of this system).
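For anyone curious what the "break it into packets, send, reassemble, report what's missing" part looks like in practice, here is a minimal Python sketch of the general idea. This is not the CySat-1 protocol: the packet layout, the function names, and the payload size are all made up for illustration, and I've swapped in a plain CRC32 check where the real system used its own scrambling scheme.

```python
import struct
import zlib

# Hypothetical packet layout: 2-byte sequence number, 2-byte packet count,
# payload bytes, then a 4-byte CRC32 over everything before it.
HEADER = struct.Struct(">HH")
CRC = struct.Struct(">I")


def packetize(data: bytes, payload_size: int = 64) -> list:
    """Split data into numbered packets, each protected by a CRC32."""
    chunks = [data[i:i + payload_size] for i in range(0, len(data), payload_size)]
    packets = []
    for seq, chunk in enumerate(chunks):
        body = HEADER.pack(seq, len(chunks)) + chunk
        packets.append(body + CRC.pack(zlib.crc32(body)))
    return packets


def reassemble(packets):
    """Return (data, missing). data is None until every packet has arrived intact."""
    received = {}
    total = 0
    for packet in packets:
        if len(packet) < HEADER.size + CRC.size:
            continue  # too short to be a valid packet
        body, trailer = packet[:-CRC.size], packet[-CRC.size:]
        if zlib.crc32(body) != CRC.unpack(trailer)[0]:
            continue  # corrupted in transit, treat as never received
        seq, total = HEADER.unpack(body[:HEADER.size])
        received[seq] = body[HEADER.size:]
    if not received:
        return None, []  # nothing valid arrived, so the packet count is unknown
    missing = [seq for seq in range(total) if seq not in received]
    data = None if missing else b"".join(received[seq] for seq in range(total))
    return data, missing


if __name__ == "__main__":
    measurement = bytes(range(200))
    packets = packetize(measurement)
    del packets[2]  # pretend one packet was lost over the air
    print(reassemble(packets))  # data is None, missing lists packet 2
```

The list of missing sequence numbers is what lets the ground station ask for only those packets again instead of the whole file.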

And this was a regular occurrence, I'd push two buttons and all of those systems would do their thing perfectly almost every time. I have a folder full of probably over a hundred radio downlinked files from the satellite's testing phase on my laptop. The main operating phase of flight was working and it was working well (in short term testing, at least).

One thing however that got pushed out of being possible was proper long term testing of the satellite and focus on the startup sequence. The most that happened was that the satellite was left on overnight in the solar chamber to charge via solar a few times (one time I talked to the satellite from my bedroom at home), and after one of those times, the OBC had locked up for some reason which we never concretely determined as by that point we had mere days left and had already pushed the deadline as far as it would go.

In those last few days we added a backup system to deploy the antenna in case the existing system did not work.

On the final day I did some final stress testing of the power watchdog task (designed to restart the satellite if the battery voltage dropped too low) and we determined that it was possibly the cause of the unexplained freeze but we couldn't figure out why, and even if we did, we did not have enough time to verify the fix (the bug could take hours to show itself and we had like 2 hours left). So we simply disabled the power watchdog, not ideal, but it was one of several safeties in place.
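For context, the basic logic of a power watchdog like this can be sketched in a few lines of Python. This is only an illustration of the concept, not the flight code (which ran as a task on the OBC and which we never fully debugged); the class name, threshold, and sample count are hypothetical.

```python
class PowerWatchdog:
    """Request a restart after several consecutive low battery readings.

    The threshold and sample count are hypothetical placeholders.
    """

    def __init__(self, threshold_v: float = 6.8, required_low_samples: int = 3):
        self.threshold_v = threshold_v
        self.required_low_samples = required_low_samples
        self.low_count = 0

    def feed(self, battery_v: float) -> bool:
        """Record one voltage sample. Returns True when a restart should be requested."""
        if battery_v < self.threshold_v:
            self.low_count += 1
        else:
            self.low_count = 0  # a single healthy reading resets the count
        return self.low_count >= self.required_low_samples


if __name__ == "__main__":
    watchdog = PowerWatchdog()
    for voltage in [7.0, 6.5, 6.5, 7.0, 6.5, 6.5, 6.5]:
        print(voltage, watchdog.feed(voltage))
```

Requiring several low readings in a row is a common way to keep one noisy sample from rebooting the whole satellite.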

There were bits of the code that were untestable as they controlled one time use components like the antenna and magnetometer deployment systems. The final hour pre deployment was spent going over the satellite's startup sequence with two additional sets of eyes on the code. With trembling hands, I uncommented the "Code RBFs," compiled the code one last time, and uploaded the final flight code to CySat-1. I made sure it turned on (and didn't immediately crash) and immediately shut the satellite off. The programming umbilical was cut, and CySat-1 was packed into its suitcase, where it would be delivered for integration with the Cygnus spacecraft, which would be delivered for integration with a Falcon 9 rocket, and finally, it would be launched up to the International Space Station, where it would be loaded into a CubeSat deployer, and ejected into space.

The satellite really could have used another semester. Like we were discovering new bugs LITERALLY in the final hour before handoff here. If we had any shot of another extension and if I believed at all that the incoming team had a shot of hitting the ground running, I would have advocated as hard as I could have for more time, but as I've said a number of times, that was it. We had already pushed our luck with time. We were in triple overtime and we had to make a hail mary. I knew the odds of it working were pretty low but I thought we had a good shot at at least getting a beacon back, with my personal celebration threshold set at successfully pinging the satellite once. But that decision had been made months ago, there was never going to be another semester. I played almost as well as could reasonably be expected with the hand I had been dealt. The die had been cast, and it would be months before I got to look at what had been rolled.

I would also like to mention that this entire time the two of us were also juggling packed Aerospace Engineering course loads and part time jobs. Given the entire summary I've given you, the fact that it worked on the ground, and got off the ground at all, I hope my assessment that it was a minor miracle is shared.

After Me

When I left the project, I was very concerned about leaving the incoming and to-be-chosen team in the same position that I had been in when I started. I was worried about a team of people that didn't know what they were doing doing their best but inadvertently making things worse, just like I had when I was new. So I spent a decent chunk of the summer writing the handover guide overexplaining everything as best as I could with the explicit instruction to open a line of communications to me and to not feel like they were burdening me by using it.

I finished that handover guide sometime in the Summer, and around that timeframe I watched a Falcon 9 carry Cygnus NG-21 into orbit. Half of the programming umbilical was hanging up on my bedroom wall above my laptop, and I was filled with the realization that the other end of that cable was now in space.

Then, the Cygnus had issues. Not one, but two aborted thruster burns. The thought that it might never reach the ISS crossed my mind. A small part of me, for a moment or two, selfishly hoped that the Cygnus would fail, as if CySat-1 failed once it ejected from the ISS, there was a near 100% chance that it would have been my fault. But if it never reached the ISS, there would have been no ambiguity, no wondering, it would have been something else completely out of my control. I did my best to shut that part of my brain up.

Thankfully, the issues with the Cygnus were resolved and it made it to the ISS perfectly fine.

A few weeks after the Fall 2024 semester had started I hadn't heard from the new team yet, so offhandedly I checked the documentation folder and I found a newly created document of open problems with the satellite's code, one of which was severely in error, and I contacted them letting them know, with the assumption that they were in a similar spot to where I was when I had joined the project. This was noted and no more communication was had for several weeks, until one day I looked in and saw that the satellite had been ejected from the ISS weeks ago and nobody had told me.

I quickly sent off an email, and they responded, reporting that no contact had been made with CySat-1, followed by what appeared at first glance to be a bunch of bad assumptions that a new and inexperienced team would make. My assumption was that they had not set up the communications properly and had not been able to contact the satellite. I responded, doing my best to help, and...

As this satellite loves to keep teaching me lessons, the opposite situation happened than what I had prepared for, as with numerous times during this project. This was one or two small errors by a team significantly more competent than any team I had ever been on. They had like six years of CubeSat experience between them, had contacted international partners with massive satellite dishes in an attempt to contact the satellite, and had gone through my code with a fine-toothed comb and picked out every mistake that I had ever made.

To my knowledge there are no other CubeSat programs at Iowa State. These people must have either transferred in or got CubeSat experience in high school. Which... No wonder I'm having trouble getting a job. The only thing I have going for me is two years working on the most scuffed satellite to ever exist, bumbling from one failure to another and succeeding only by trial and error, and I thought that was worth something. But no, by chance our team was not the greatest, and in reality, the thing I labored so hard on and ultimately failed at is easy and common enough that the team that came after me had multiple people who had done this stuff in high school and were immediately able to come up to speed and figure out a bunch of things I did wrong, and deemed my ramblings so meaningless and useless and possibly condescending that I was never contacted except in response.

(This part is almost certainly hyperbole, me catastrophizing the actual events, but it is a snapshot of what went through my brain for a few hours after I found out, before I came to my senses and did my best to work the problem)

Well firstly, where were these people when we needed them, but secondly, that's destroying me. Not only am I quite possibly incompetent, but I came off as arrogant and condescending when I was trying to help. And to me that's worse than if the satellite had simply failed. This entire project has been a long series of me trying as hard as I could have reasonably been expected to and being repeatedly told "No. You're wrong. Do better." And I kept going. And I kept trying to be better. I'm miles ahead of where I was, but I'm still miles and miles away from where I need to be to succeed.

Whenever I think CySat-1 has run out of lessons to teach me, it delivers another devastating punch to the gut. At this point I would not be surprised if when it re-enters it somehow survives and hits me on the head as if to say "You weren't using that for anything anyway!" I suppose there are no lessons to be taught from success. But the lack of success sometimes has me wonder what I'm going through all this failure for, or if success was ever in the cards for me...

Whenever I failed at something, I always expected to feel one way but ended up feeling another way. I spent high school as the smartest person in the room, and the first half of college desperately trying to get that feeling back, and then the first and second semesters of CySat realizing that I was the smartest person in the room and desperately wanting to not be, wanting someone smarter than me to come in and save the day. When CySat failed to meet its handover deadline that first time I expected to feel sad and angry, instead I felt relieved. The second time CySat failed handover I expected to feel relief, instead I felt empty and sad. I expected the third attempt at handover to be triumphant, instead I felt nervous and like I could have done more. I spent ages terrified of breaking the satellite, each time I thought I broke it, it was actually okay, and when I finally broke the satellite for real, I was calm, and was able to fix it in under two days. And when CySat flew and failed, I expected to be disappointed, I didn't expect to find out that I was hurting others in the process of trying to help, and I certainly didn't expect to uncover the possibility that the second hardest thing I've ever done really isn't anything that uncommon.

CySat-1 has taught me a lot technically and emotionally. But mostly I expected it to teach me to never give up. And instead I got an outcome that is at first glance nearly identical to the one I would have gotten if I had given up two years ago. I'm not sure what it is trying to teach me this time. Maybe it isn't done yet.

I hope it isn't done yet. Because this would be a sad ending. Though this project has certainly made me be careful what I wish for, so maybe not...

Story 3 - The "Invalid Telecommand ID" Incident, The Most Enigmatic Bug I've Ever Had To Deal With

In an effort to not end the first post on that rather depressing note, here's a fun-in-retrospect story about the team constantly barking up the wrong tree!

The seeds of this incident were likely planted in early 2023 or perhaps late 2022, and the root cause was not discovered until the last few weeks of the Summer of 2023, to give you a vague idea of the timeframe here. I could pull up all the Git commits to give a day by day but that's too much effort for a campfire story (though even that wouldn't be very accurate there as I'd frequently go days without pushing (I have much better Git discipline now)). Some additional timeline context. I will remind you that we were handling dozens of other issues at the same time, with a small team only obligated to put in 3-10 hours a week each depending on what level people were at, all of whom were juggling a heavy course load, and some of whom were also working part time jobs. So if something seems like it is taking longer than it should, mentally readjust your timescales to account for those factors.

One day, we were testing the UART link and associated Command and Data Handling software between the OBC and the ADCS. We noticed that the satellite's ADCS sometimes worked and sometimes didn't work. We had an intermittent bug on our hands!

Over the next few weeks, the enigma only got worse. At that point, we knew the ADCS went through periods of full functionality and periods of very limited partial functionality. During the limited functionality periods, the ADCS would return an error code that meant "Invalid Telecommand ID" when we sent it any telecommand with an ID greater than 9. Sometimes it would start and stop working several times in the same day, sometimes it would spend days or weeks without changing (and later, months).

Our first culprit was a bug in the code that the OBC ran to talk to the ADCS. Specifically, we wondered if any strings were getting mishandled. We were suspicious that one digit telecommand IDs worked and two digit telecommand IDs did not. 9 would work but 10 wouldn't. At first glance this would appear dumb as computers don't talk in base 10, but for some reason, Endurosat's subsystems all talk in strings. I find that a bit baffling, but you do you, Endurosat. In this case, an error in string allocation would make sense, but not why it was intermittent.

This wasn't particularly likely as the ADCS used an integer byte or two for the telecommand ID, and in binary the transition from 9 to 10 doesn't have any special significance. Just to be safe, we went over everywhere the ADCS used strings (which might have been nowhere), and that turned up nothing, and then for good measure we double checked the allocations on all arrays related to the ADCS. Also nothing. In our full sweep of the ADCS code, we did however catch one bug where a while loop could have gone on infinitely in some edge cases and soft-locked the satellite. Though that was definitely not the issue.
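To illustrate why the 9-to-10 boundary looked suspicious if strings were involved, but shouldn't matter for a raw integer byte, here is a tiny Python sketch. The two encoding functions are purely illustrative and are not the actual Endurosat or CubeSpace wire formats.

```python
def encode_as_text(telecommand_id: int) -> bytes:
    """Send the ID as decimal ASCII text, the way a string based protocol might."""
    return str(telecommand_id).encode("ascii")


def encode_as_byte(telecommand_id: int) -> bytes:
    """Send the ID as a single raw byte, the way a binary protocol might."""
    return telecommand_id.to_bytes(1, "big")


if __name__ == "__main__":
    for tc_id in (9, 10):
        text = encode_as_text(tc_id)
        raw = encode_as_byte(tc_id)
        print(tc_id, text, len(text), raw.hex(), len(raw))
    # 9 b'9' 1 09 1
    # 10 b'10' 2 0a 1
```

As text, going from 9 to 10 changes the message length from one character to two, which is exactly where an undersized buffer would start causing trouble. As a raw byte, 0x09 and 0x0a are the same size and nothing special happens at that boundary.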

This was complicated by the fact that whenever the ADCS started working or stopped working, we would interpret that as being related to the most recent change we made. Sometimes we would think we fixed it and it would show up again weeks later, and sometimes we would think we broke it and spend a lot of manhours trying to figure out how the piece of code we just changed broke the ADCS again - Such a small, inexperienced, and disorganized team (much of which is on me, I'm not knocking them) was having a little bit of trouble trying to separate correlation from causation.

One of the times it stopped working, we noticed that the ADCS was making physical contact with a part of the boost board - This was not supposed to happen. The satellite has spacers to prevent this from happening, but the threaded rods and the spacers that go on them are an, excuse me, pain in the periapsis to set up, and during this phase of testing, the satellite's subsystems were being frequently stacked and de-stacked and held in by only their PC-104 pin connectors with no additional support. Additionally, the spacers were in the process of being redone, and had been assigned to one team member. I'm not here to drag anyone through the mud. Despite the project's ultimate failure, I can say that everyone gave this project a good amount of effort - Except for spacer girl. She kept saying that progress was being made and had nothing to show for it, always had an excuse to miss meetings, and at the end of the semester, hadn't done a single thing, and left us with no spacers (and also no power budget but that's another story). Granted she may have had extenuating life circumstances she did not tell us about, college can be stressful, and in that case, I apologize entirely to spacer girl. But I digress.

The lack of spacers, combined with the short and moderately wiggly ADCS to boost board connection, and the tall boost board, meant that the top of the boost board would rub and scrape up against the bottom of the ADCS, and looking closer, I saw what looked like a set of pads with a little bit of solder residue on them directly on the point of contact. The focus shifted away from an OBC code issue to physical hardware damage. As much as I wanted to have this issue solved, I did not want the culprit to be a broken $40,000 ADCS, the most expensive single subsystem on the satellite. This was a bit of a stretch, but this also offered a (flimsy) explanation for the intermittent failures - Sometimes the boost board made contact and might have bridged the gap across the pads, and sometimes it might not have.

We emailed CubeSpace in search of circuit diagrams to determine what this missing component was and to see if it could be replaced. They took a while to respond as the entire office was on vacation, but my interactions with them were pleasant and productive, shoutout to CubeSpace's customer support team. As it turns out, that spot was empty and contained an optional feature that we didn't need, and I'm not sure why the residue was there (maybe I was just seeing things). It just happened to be at the exact point of contact, and it was not physical ADCS damage.

This then turned our sights to the Boost Board itself.

Now, to explain the boost board... Our EPS does not have a 7.4 Volt bus. A few of the ADCS components require 7.4 Volts to function (or maybe certain versions don't but there were mistakes during ordering, I may be mixing stories up here or maybe I never heard the full story). So, previous teams had bought a step-up voltage converter and wired it to a square of prototype board, and cut a few pins to make a makeshift 7.4 Volt bus to properly power the ADCS.

The boost board was getting pretty warm. At the time we thought this might have been abnormal, so we grabbed a multimeter and checked some of the voltages on various parts of the boost board. The input voltage was 5 Volts, and the output voltage was, drumroll... Also 5 Volts. So, broken, and there was discoloration on some of the electrical parts to add to the evidence pile.

At this point I think we were thinking that the boost board issue was a completely separate issue from the ADCS issue. After all, why would a faulty power converter make the ADCS not accept 2 digit command IDs?

We figured out what model it was and ordered a new one (in retrospect not a great idea to order the same one) but by the time it came in, the semester had ended and it was Summer. Neither me nor the other researcher/intern (I feel really weird calling it an internship but she does so idk) knew how to solder. I had always wanted to learn but whenever the chance came up it was always for something on the flight hardware and I was a bit irrationally afraid of messing it up as I didn't want the first thing I soldered to be flight hardware. So we had to wait until someone else on the team was available to come in and solder it for us (there were probably other people there but we didn't know anyone and technically they wouldn't have been cleared to go into the clean room but in retrospect I probably shouldn't have cared about that as much as I did).

We went into the clean room and turned on the soldering iron in there, and - It was broken.

So we brought the boost board (referring to the prototype board with the boost board attached to it) out of the clean room, and we broke (not really) into the empty electrical engineering building, where I watched him desolder the old boost board and solder the new one in place. We brought it back to the clean room, reassembled the satellite stack, turned it on, and...

...Still nothing!

Then we brought out the multimeter again if I remember right, and then we noticed some weirdness with the voltages. I opened the EPS reference manuals, the EPS debug program, and stared intently at the boost board, and I don't remember the exact order of events, but...

...Back at the very start of the project, before I had even successfully managed to install the IDE or, likely, so much as touch the OBC once, I was assigned to be the EPS person: I would learn as much about the EPS as I possibly could. So I read through both of the EPS reference manuals provided by Endurosat. At this point I was an incoming Junior and my knowledge of electricity ended somewhere just slightly after Ohm's law (a slight exaggeration for comedic effect, I think I had passed Physics II by that point).

I read about this setting called EPS LUP 5V. I came away from that section with the impression that it was a setting that could toggle latch up protection for the 5V bus at the cost of some extra power consumption. For some reason there were also some pins assigned to LUP 5V, but I assumed they might have been for toggling it or something, or maybe they were outputs for toggling something else that required 5 Volts to toggle. Or maybe the 5V pins also output 5V but this setting toggled the latch up protection for those pins only. I remember being confused by the purpose of this functionality, but I came away with the impression that it was an optional safety. As this didn't appear to be important to the functionality of the satellite, I never really thought about it much after that, and that explanation stuck with me as I never needed to revisit that section of the reference manual.

So cut back forwards to the Summer, and I noticed that the boost board was not actually wired up to be powered from the 5V bus, but was instead wired up to the LUP 5V pins! What??

So I reread the manuals, and... LUP 5V isn't a protection for the 5V bus, it is a completely separate, toggleable, and latch up protected 5V bus! That EPS setting doesn't turn on and off the protection, it turns on and off the entire bus!

So I figure that this is the cause of the boost board issue at least. Still no progress on the ADCS issue, but that pin should at least be getting 7.4 Volts now, so I turn it on, I check the Voltage, it is 7.4. The old boost board probably wasn't even broken. But then I booted up the satellite, and...

...The ADCS started working again.

WHAT???

WHY?????

But then the pieces all clicked.

Let's go through the order of events with hindsight.

That day that the problem was first noticed was also the day someone (likely me) was in the process of moving the power-on code from one location to another to better mesh with how the RTOS handled tasks. I don't recall the exact setup, but I would be confident in saying that at least two laptops were out, taking turns pushing code to the OBC for testing, and someone was also likely using the EPS's USB debug program for testing purposes as well. So there was me, messing with the new power-on code, and there was one of the junior programmers, messing with the ADCS code, with the old power-on code as I hadn't pushed the new stuff to GitHub yet. I likely had also been testing ADCS stuff from my laptop in an attempt to help out, and I probably had the EPS USB debug program open as well.

If EPS LUP 5V is on, the ADCS works just fine. If EPS LUP 5V is off, the ADCS says "Hey I've got a problem! There's not 7.4 Volts on this pin!" and partially fails, but either lacks a dedicated error code for that situation and spits out the "Invalid Telecommand ID" error, or that error is also supposed to mean "Invalid Telecommand ID in this power configuration." Though that last bit is just a hunch. The ADCS manual is pretty long and I did only skim it and try a few CTRL+Fs, but I never saw anything about that error meaning anything but "This integer does not correspond to a valid telecommand." This would also neatly explain the one vs two digit behavior, as the lower numbered telecommands tend to be simpler - If the ADCS boots at all, the first few telecommands will work as they deal with incredibly basic stuff. It just so happened that the cutoff between functional and nonfunctional telecommand IDs occurred at the same point as the transition from one digit to two digits.
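To make that failure mode concrete, here is a toy model of it. This is only a sketch of my hunch, not how the real EPS or ADCS firmware works: the class and function names, the 7.0 Volt threshold, the 0 Volt "off" output, and the cutoff at telecommand ID 9 are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical cutoff: in this toy model, telecommand IDs 0-9 are "basic"
# and work without the 7.4 V rail. The real cutoff is only known anecdotally.
BASIC_TELECOMMAND_MAX_ID = 9


@dataclass
class Eps:
    """Stand-in for the EPS; only models the switchable LUP 5V bus."""
    lup_5v_enabled: bool = False


def boost_output_volts(eps: Eps) -> float:
    """The boost board is fed from LUP 5V, so it only makes 7.4 V when that bus is on."""
    return 7.4 if eps.lup_5v_enabled else 0.0  # 0.0 V when off is an assumption


def send_telecommand(eps: Eps, telecommand_id: int) -> str:
    """Return the (modelled) ADCS response to a telecommand."""
    if telecommand_id <= BASIC_TELECOMMAND_MAX_ID:
        return "OK"  # basic commands work as long as the ADCS boots at all
    if boost_output_volts(eps) < 7.0:  # assumed minimum for the 7.4 V parts
        # The misleading part: a power problem reported as a command problem.
        return "Invalid Telecommand ID"
    return "OK"
```

With `lup_5v_enabled` left off, one digit IDs succeed and two digit IDs fail with the telecommand error, which is exactly the pattern that sent us looking at the communications code instead of the power settings.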

Now from my recollection, the old power-on code explicitly turned EPS LUP 5V on, and the new code did not, or was in a state of flux where it may or may not have. This seems like an easy correlation to notice - The ADCS works when you push code from one laptop but not the other - but we were solving other ADCS bugs that day, I was probably constantly toggling the EPS settings (possibly including EPS LUP 5V) to see if they would help or to test them, and we weren't really firing on all cylinders yet, so any correlation got lost in the noise of the day.
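In hindsight, the kind of bookkeeping that would have pulled that correlation out of the noise looks something like the sketch below: write down the EPS settings alongside every pass/fail result, then look for a setting whose value cleanly splits the passes from the fails. We did not have anything like this at the time. The function name, the setting names, and the example log are all hypothetical.

```python
def settings_that_explain_failures(runs):
    """Return the names of settings whose values perfectly separate passes from fails.

    `runs` is a list of (settings_dict, passed) pairs, one per test attempt.
    """
    names = set()
    for settings, _passed in runs:
        names.update(settings)

    suspects = []
    for name in sorted(names):
        pass_values = {settings.get(name) for settings, passed in runs if passed}
        fail_values = {settings.get(name) for settings, passed in runs if not passed}
        # A suspect needs at least one pass and one fail, with no shared values.
        if pass_values and fail_values and pass_values.isdisjoint(fail_values):
            suspects.append(name)
    return suspects


# Made-up log of four ADCS test attempts from two laptops.
example_runs = [
    ({"LUP_5V": True, "BOOST_EN": True}, True),
    ({"LUP_5V": False, "BOOST_EN": True}, False),
    ({"LUP_5V": True, "BOOST_EN": True}, True),
    ({"LUP_5V": False, "BOOST_EN": True}, False),
]
suspects = settings_that_explain_failures(example_runs)
```

It only works if the settings get recorded at the moment of each test, which is the habit we were missing.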

I finished the new power-on code without EPS LUP 5V being explicitly set because I didn't believe it to be important. From then on, every time someone on the team revisited the EPS code, or was toggling the EPS settings in the vain hope that it would solve whatever other problem had come up, sometimes, EPS LUP 5V would get left in a different position to where it had started the day in. Generally we assumed that the power-on code would put all of the important settings back in the right place, so we didn't pay attention to where we left the settings as they would be changed around and reset constantly. Then, the next time someone came in to mess with the ADCS (which could have been days later), it would have, from their perspective, randomly started or randomly stopped working, prompting them to dig through their recent code changes to find out why. But the culprit wasn't their code. It was a line of code that had been removed months ago, plus settings that had been changed dozens of times via other people's code and the USB debug program, persisting in the flash memory of the EPS.
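The defensive version of power-on code, given settings that persist in the EPS, is to drive every output you depend on to a known state on every boot and then read it back. Below is a minimal sketch of that idea, not our actual flight code: `FakeEps`, `set_output`, `get_output`, and the output names are hypothetical stand-ins, not the Endurosat API.

```python
# Hypothetical list of outputs the rest of the satellite depends on.
REQUIRED_EPS_OUTPUTS = {
    "BUS_3V3": True,
    "BUS_5V": True,
    "LUP_5V": True,   # feeds the boost board, and therefore the ADCS
    "BOOST_EN": True,
}


class FakeEps:
    """Stand-in for an EPS whose settings survive between sessions."""

    def __init__(self, persisted):
        self._outputs = dict(persisted)

    def set_output(self, name, enabled):
        self._outputs[name] = enabled

    def get_output(self, name):
        return self._outputs.get(name, False)


def power_on(eps, required=REQUIRED_EPS_OUTPUTS):
    """Force every required output to its wanted state, verify, and report what changed."""
    changed = []
    for name, wanted in required.items():
        if eps.get_output(name) != wanted:
            changed.append(name)  # someone left this in the "wrong" position
        eps.set_output(name, wanted)  # set it regardless, never assume

    mismatched = [n for n, wanted in required.items() if eps.get_output(n) != wanted]
    if mismatched:
        raise RuntimeError(f"EPS outputs did not take: {mismatched}")
    return changed
```

Returning (or logging) the list of outputs that had to be changed is the useful part: a boot that reports it had to turn `LUP_5V` back on would have told us months earlier that something was leaving it off.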

All that combined meant that it was very difficult to trace the correlation after that first day. This was then followed by a triple whammy of red herrings:

The exact phrasing of the error code and the number at which the valid telecommands cut off led us to believe it was our OBC <-> ADCS communications code. Then, the bare pads on the bottom of the ADCS right at an unintended contact point led us to suspect hardware damage. Then, a discolored boost board and some dodgy multimeter measurements (I'm still not sure why the boost board read 5V that day, likely operator error) led us to believe that the boost board was broken.

This was also compounded by long periods of inability to act: waiting for CubeSpace to respond, waiting for the boost board to come in, and waiting for someone to be able to resolder it. Granted, we were working on many other issues (some significantly more difficult) during this period, so it is not like we had nothing to do.

And it all led back to the very first thing I did on the project, reading the EPS manuals when I didn't know any better, and one unimportant seeming misconception that stuck around with me for months afterwards.

But in simple English, the ADCS wasn't working because we had not turned part of its power supply on!

Now this raises the question. Why did prior teams wire up the boost board to EPS LUP 5V and not just to the 5V bus? And I don't know. It's not like the 7.4V bus needs a toggle. And it already has one in the form of the boost board enable toggle (which we did test extensively). Maybe the power switch within the boost board is leaky and tying it to EPS LUP 5V would eliminate the slight power draw of the boost board when it is off, by disabling EPS LUP 5V? That's the only reason I can think of why you would wire such an inconsequential component up to two separate power toggles. Which is a little bit of a shame as there was something better we could have used that EPS LUP 5V for later on if it was still available... Though that is a story for another time.

But anyway. That's the "Invalid Telecommand ID" incident of 2023. A lesson, well, not in systems engineering per se, but in systems testing and systems debugging. Oftentimes, the evidence all points in the wrong directions, and the true root cause is not at all where you expect it to be. It taught me to look for solutions in places that aren't obvious, to not immediately jump to conclusions, to not assume "Oh that bit of code doesn't do anything, I can probably delete it," and to not be too confident in the understanding gained from a first pass reading of a document that is a little bit over my head.

Another big takeaway is that whenever I hear a really stupid engineering headline - Like "SpaceX power outage procedures were stored on a server that went down in the power outage" or "Boeing Starliner fails because the clock was set incorrectly" or something along those lines, which points towards someone being really stupid, I remember the several months of confusion my team experienced over a problem that can be summarized in clickbait headline fashion as "Engineering students flabbergasted that spacecraft they didn't turn on isn't working!" There's often a lot of context behind the headlines that you don't hear. Sometimes a stupid sounding mistake is really a stupid mistake, but my experience with this bug reminds me to give the people behind those headlines the benefit of the doubt. Complex problems can be made to sound dead simple, and simple sounding problems can turn out to be extremely complicated.

Stories I haven't told yet but might at some point

How we broke several thousand dollars worth of solar cells
How I finally contributed in a meaningful way to the hardware
Our two failed attempts to meet launch deadlines
NOAA and remote sensing
CySat-1: A Geopolitical Nuisance to Iran's Reconnaissance Satellites
The balloon flight test
Deciphering the UHF's Packet Protocol, and my happiest moment ever on the project
Interrupt Priority Conflicts
How one misspelling meant the ADCS didn't actually work
The Surprise Carrier Board Swap
The worst solar power testing setup in the entire world
The Batteries
How I fried the OBC and recovered in under 2 days
The Ground Station Software, an exercise in multithreading
The emotional side of the journey
The test files, "And what is the use of a book," thought Alice "without pictures or conversations?"
The LNAs and my hardware incompetence (including reverse engineering the machine used to mint an EE Master's Degree)
An exercise in improper calibration
The broken cable incident
The Roomba cable incident
The part where we stole mosfets from a submarine and sent them into space
The part where we thought about stealing another professor's radiometer
The trouble with the radiometer's power consumption
The screen going blank and our efforts to re-image the satellite's SD card
The @*&^#!*&^ antenna
The !*&@#^*&^ ground antenna
That time we plugged the entire satellite in wrong
The RTOS and random freezing
The ADCS off by one error
The ADCS and TLE APIs
The story of the sun sensors (which is also to an extent the story of the solar panels)
A few small anecdotes of me being in waaayyyy over my head
CloneComm and how it was named
The early days when we didn't know what we were doing (LED debugging, how to turn it on, the pumpkin board, etc)
The grading fiasco and why I accidentally graduated several months late
How not to design for assembly
That time we almost cost the university a fortune in storage fees by accidentally pushing a multi gigabyte radio recording to GitHub
How we almost launched CySat into space with the RBF pin wired incorrectly
The never ending struggle to keep the roller switch wires from falling off
How I picked up the pieces of my shattered dreams, glued them back together, and moved forwards stronger (hopefully, this part is still being written)

As you can see, there are a lot, and those are just the ones I remember off the top of my head. Gather around the campfire, because we may be here for a while! Let me know if there's any particular story titles that catch your eye or if there are any questions you have.

AckSed · December 20, 2024

I read the entire thing.

Man, I think I experienced roughly the same feelings reading it: glad you had the opportunity, wincing at the mess-ups. I sympathise with feeling out of your depth at uni, though not to the extent of working on hardware that was made to be launched. All I did was a BSc in Internet Computing circa 2009, and my ill-chosen final-year project nearly broke me. (Don't choose fuzzy-logic when you barely know how to make a database.)

I would like to hear "How not to design for assembly" next.
