Jump to content

Coding blunders


Geonovast

Recommended Posts

i wrote a gui script in lua to control an arduino over a serial connection. i kept having this problem where the serial library i was using would randomly stop working, and i got fed up with re-launching the script from the command line (i kept instinctively closing the cmd window when i was done). so i came up with the bright idea to add a button to launch another instance of the script with a shell command. somehow i accidently used the idle callback instead of the click callback. needless to say when i loaded the script, i was immediately greeted to thousends of instances of my script as the thing spawned exponential instances until i ran out of memory.  ikt was reminiscent of some of the computer viruses we had in the '90s. 

Edited by Nuke
Link to comment
Share on other sites

When we got deep into rewriting most of the CKAN bot in Python, I got burned over and over by the fact that an object is just a dict in Javascript but not in Python, since otherwise the languages had a pretty similar feel to me:

d = { "a": 1, "b": 2 }
x = d.a                         # This works in Javascript
obj = SomeObject()
y = obj["property with space"]  # So does this

Finally had to set up Mypy to catch all the chronic AttributeErrors and TypeErrors before runtime.

Edited by HebaruSan
Link to comment
Share on other sites

Mine's scripting related, is that good enough?

Back in the deep dark days before Windows, I needed to consistently add something to my dos PATH variable, but it couldn't be in there by default so I didn't want to just set it every time. So I made a batch file that was just

PATH %path%;<whatever I wanted, I really don't remember>

I named this batch file PATH.BAT, and ran it.

It did not immediately exit like I'd expected it to. It sat there, running. Curious, I hit Ctrl-C and typed PATH to see what my path was.

The same thing happened. I became concerned. Was something wrong with my computer? Was accessing the PATH variable somehow locking something up? I hit CTRL-C again, grabbed the DOS user manual (this was long before the Google, and even before the Web) and found that I can also find my path by typing SET PATH. I did so, and found my path was the original PATH variable and nothing more. More importantly, there was nothing that showed there were any problems.

I don't know how many times I went around the problem but I eventually found it. PATH.BAT, in my local folder, was running when I (or it) ran PATH. It had priority. So my batch file was running itself, over and over.

That's how I learned about recursive batch files.

I renamed the batch file to "PATHIT.BAT" and TO THIS VERY DAY my little DOS utility programs have "it" at the end. And yes, one of them is named "pathit" though it does a bit more nowadays than just set the path.

Link to comment
Share on other sites

Well… Since we are talking Python here… :P

Long background history follows - it's interesting to read it do contemplate the huge problem I created, but the thingy became too long. :P 

Spoiler

We are a glorified Service Bus - instead of the Client implementing support for his contractors directly into their systems, and then being screwed up when the contractor leave, they hire us for doing the job on their systems and we then handle the contractors - they leave, it's up to us to adapt our middleware to talk to the new guys (saving the client of any changes).

And we are talking really big systems here running on infrastructure 30 years old (really big players don't change infrastructure, it's too costly and dangerous for their operations), so changing contractors are always a pain in the SAS. The kind os machines these guys use on their landscape are beyond anything I ever had seen with my own eyes.

On the other hand, these same contractors are not exactly sad having us as the man-in-the-middle because they don't want to pay top-dollar for specialists on such legacy systems. These clients are big, and they want their money - but it's too costly to maintain a whole team of specialists just for 1 or 2 big clients - it's way more profitable for them to share the money with us and let us handle the mess.

Oukey, now you know how deep is this rabbit's role. :D

3 years ago, we at day job were being contracted by our first really big costumer. Dude, that guys ARE big - thousands and thousands of transactions at day. And I was writing a service to distribute some legal documents they need in order to dispatch the trucks around the country (and Brazil is a big one).

But… Sheet happens, these pesky white pieces of paper insists on falling from the skies on the most inconvenient time and ways.

Spoiler

Who likes Sheet Storms? :sticktongue:

vector-illustration-businessman-suit-bri

Sometimes the Client's systems are out-of-reach (later we diagnosed a major blunder on their cloud network provider), sometimes is the partner' who is out-of-reach - and synchronising them on a mission-critical operation where a truck full of goods would be held until you deliver that Kraken damned electronic legal permission didn't helped on making things easier.

So we decided to bufferize everything. We would scavenge the partner's systems for such authorisations and preemptively made them available on a storage as soon as they are made available by the government, so when the Client hit us asking for them, we would have them ready to be delivered without hitting the partner and asking for tons of authorisations on peak time (ruining the day of their DBAs).

Well, the Client didn't wanted to host the Storage. The Client didn't wanted such Storage to be host outside the country. And we didn't wanted the partners pushing the files on our systems due the constant downtimes on the communication - we would have to poll them all the time for anything we had missed, again screwing up their databases and defeating the very purpose of implementing the push service at first place.

So we had compromised on using a storage service hosted on a separate machine on their infrastructure. They feed the Storage, we empty it.

So now I had to built a customised way to access that Storage the fastest I could as we are on the critical path of mission critical and highly tight on the timings process, so I cooked up a Connection Pool for the Storage Service to avoid connecting and authenticating on every access.

I made (almost) everything right - kept the connections alive regularly and replaced it with new ones if they dies - so the Client would have instant access when needed.

Except that I forgot to handle one exception inside a mutual exclusion lock and when that one exception that I didn't had foresee finally happened and leaked it into that critical section, the whole thing halted silently in a dead lock. And that crap, obviously, happened early early morning, when the authorisations for the leaving cargo at morning should start to get being fetched.

At first light I received a call with a desperate dude yelling that he had LOTS OF TRUCKS FULLY LOADED waiting for permissions that were not arriving.

By the time I diagnosed and fixed the problem, they had FOURTY TRUCKS waiting for me on the parking lot.

It took 4 more hours to retrieve and process all the permissions from the partner's systems (while they were calling us swearing because I was DoS'ng their services). At Noon, the last truck that should had departed at 6:00AM finally left the parking lot and we stopped screwing up the partner's services (they shoved us on a dedicated machine after that).

Good thing I was already working remotely. Had I being working on premises, the truck drivers would had hunted me and squashed me inside the server :P - most of them drove late night because of this stunt!

Simplifying things a lot, this is more or less what I did:

with self.THE_LOCK:		# where lock is a threadling.Lock object
	all_connections = self.connections.keys()
	do_some_stuff()
	while all_connections:
		c = all_connections.pop()
		c.acquire_internal_lock()
		if not c.is_alive():
			self.connections.pop(c)
			c.close()
		c.release_internal_lock()

And this is more or less what I should had done:

with self.THE_LOCK:		# where lock is a threadling.Lock object
	all_connections = self.connections.keys()
	do_some_stuff()
while all_connections:
	c = all_connections.pop()
	c.acquire_internal_lock()
	if not c.is_alive():
		self.connections.pop(c)	# thread safe operation
		try:
			c.close()
		except:
			# It's dead, Jim!
			pass
	c.release_internal_lock()

Explain: we are talking multithread here - so the c.close() is a cross-thread trigger, where the close would call some house keeping business on that thread loop and exit - but the house keeping code would wait for the internal lock to be released first.

The internal lock was a MUTEX created using the client's credentials+service_address as name - important, because that Storage was not reentrant and I had to prevent two threads trying to access the same store at the same time if by some reason the client asks for more stuff before I finish the last call.

If c.close() blows an Exception on the first version of the code, the whole running thread exits in error and the internal lock of that connection is never released! So any attempt to recreate the connection to that Storage using that credentials would lock forever.

On the second version of the code, if c.close()screws up, the thread keeps going, release the insternal lock and finishes normally.

(removing the while loop from the THE_LOCK Lock is an optimisation - I don't need to halt the whole shebang, I only needed to lock the do_some_stuff)

— — POST POST EDIT — — 

Interesting enough, I just realised I still have a flaw on that code!

The REAL good code is:

with self.THE_LOCK:		# where lock is a threadling.Lock object
	all_connections = self.connections.keys()
	do_some_stuff()
while all_connections:
	c = all_connections.pop()
	c.acquire_internal_lock()
	try:
		if not c.is_alive():
			self.connections.pop(c)	# thread safe operation
			c.close()
	except:
		# It's dead, Jim!
		pass
	finally
		c.release_internal_lock()

The previous code was somewhat optimistic - if c.is_alive() would raise an exception, I would have the original problem again. Ok, I know it will not because I wrote that thing to consider an exception as evidence of death - but someone maintaining it later will not know that (not to mention that additional code would be needed later and then the assumption would break).

This final form of the code is the one I had should had written. Well, I'm fixing that code now. :D 

(It's interesting how the human mind works: "you need to get out of the island in order to see the island…")

Edited by Lisias
"He's dead, Jim!"
Link to comment
Share on other sites

12 hours ago, Lisias said:

At first light I received a call with a desperate dude yelling that he had LOTS OF TRUCKS FULLY LOADED waiting for permissions that were not arriving.

By the time I diagnosed and fixed the problem, they had FOURTY TRUCKS waiting for me on the parking lot.

Ah yes the best time to debug, and fix problems with highly critical code. When someone is yelling at ya ;D

Link to comment
Share on other sites

1 minute ago, MKI said:

Ah yes the best time to debug, and fix problems with highly critical code. When someone is yelling at ya ;D

Having a "gun" pointed your heads :sticktongue: is a hell of a motivator!!! :) 

Link to comment
Share on other sites

Back in my university computer science days, I took a required Fortran class.  One day late in the semester we received a particularly difficult assignment, but it was only worth 20 points.  At that point in the semester, my classmates and I were trying to finish up several other difficult projects that were going to impact our grades more than this 20 point project, so most of my classmates decided to just not do this Fortran project.  They were willing to just accept a "zero" on this one. 

But I thought about it a little bit, and thought I could do the assignment.  I wrote the program, got it working, and got correct results.  Good!  Twenty points for me!

The instructor passed our printouts back to us in class (with grades written on them).  On my printout, he had written a note that said he had told the class never to mix floating point and integer math.  I don't remember if he actually ever said this to us, maybe he did.  Turns out that our Fortran compiler allowed for this (and corrected things behind the scenes), so my program "worked" properly, even though I had mixed floating point and integer math.  In fact, I had mixed floating point and integer math 23 times in my program.  The instructor took 1 point off for each time I did this, resulting in a score of -3 on this project for me.

In front of the class (probably not a good idea), I said something like this to the instructor:  "So I should have blown this project off, like most of my classmates did.  I would have gotten a better score of zero."

"Yes."  was the instructor's reply.

This was a long time ago, but I stay in touch with several of the guys who were in this class with me.  They like to remind me of the time they scored better than me on an assignment by not doing it.

Link to comment
Share on other sites

This thread is quite old. Please consider starting a new thread rather than reviving this one.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...