I did this

I was testing the Office Timeline August 2016 release candidate and the software generated the timeline below. I’m really proud and I am sure I can say without false modesty that it looks awesome. All I did was import data from a Microsoft Project file and then change the default style to Round.


Goodbye, old friend …

Goodbye, C++. I like another programming language much more now …

Yesterday I had to service a request from a tester: there’s a legacy application that’s still a money maker. It was developed in Delphi 7 and it uses .INI files to store the application configuration. It’s a pain to maintain, let me tell you …

There are two INI files containing exactly the same data, something like:

[Tests]

WBC=WBC

RBC=RBC

IMAGE001=SCAT_WBC

IMAGE002=DIST_PNG

FLAG001=Neutropenia

FLAG002=Neutrophilia

NRBC=NRBC

FLAG003=Anemia

So the tester wanted to split the [Tests] section into three sections, [Tests], [Images] and [Flags], and sort the lines inside each section by key. I assume this would make it much easier for him to inspect the .INI files (I do not get why he wants to do that; he should test the application using the black-box method). Of course, I would have had to update the legacy Delphi 7 code to properly read and then write back the INI files, which I immediately told his boss was not an option. They accepted my solution instead, which is to implement a command line tool that reorganizes the [Tests] section in a more readable way: all “true tests” together, sorted by key, followed by all “Image*” lines sorted by key, followed by all “Flag*” lines sorted by key.
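For the curious, the reordering itself can be sketched in a few lines. This is not the code of the actual tool, just a minimal illustration; the names Entry, hasPrefixNoCase and reorderTests are made up, and I am assuming that a key belongs to the image or flag group simply because it starts with IMAGE or FLAG (compared case-insensitively).

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<std::string, std::string>;  // key, value

// Hypothetical helper: case-insensitive "key starts with prefix" check.
bool hasPrefixNoCase(const std::string& key, const std::string& prefix) {
    if (key.size() < prefix.size()) return false;
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        if (std::toupper(static_cast<unsigned char>(key[i])) !=
            std::toupper(static_cast<unsigned char>(prefix[i]))) {
            return false;
        }
    }
    return true;
}

// Returns the entries of [Tests] regrouped: true tests first, then images,
// then flags. std::map keeps each group sorted by key.
std::vector<Entry> reorderTests(const std::vector<Entry>& entries) {
    std::map<std::string, std::string> tests, images, flags;
    for (const auto& e : entries) {
        if (hasPrefixNoCase(e.first, "IMAGE")) {
            images.insert(e);
        } else if (hasPrefixNoCase(e.first, "FLAG")) {
            flags.insert(e);
        } else {
            tests.insert(e);
        }
    }
    std::vector<Entry> ordered;
    ordered.insert(ordered.end(), tests.begin(), tests.end());
    ordered.insert(ordered.end(), images.begin(), images.end());
    ordered.insert(ordered.end(), flags.begin(), flags.end());
    return ordered;
}
```

A real test whose name happens to start with FLAG or IMAGE would land in the wrong group, so a stricter rule (prefix followed by digits, for instance) might be safer.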

I was not allowed to use C# to implement this utility, because the target systems still run an older version of .NET Framework. So, C++, back to you.

Boy, did I google a lot while writing the tool: how to read .INI files, how to parse the result of GetPrivateProfileSection (in the end I wrote my own state machine for this), how to sort a map (Google told me the map is sorted by key by default 🙂).
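For reference, the buffer filled by GetPrivateProfileSection is a sequence of null-terminated "key=value" strings, with one extra null after the last string. The sketch below is a simpler alternative to a state machine, not the one I wrote for the tool; parseSectionBuffer is a made-up name, and it assumes the ANSI (char) flavor of the API and a buffer that was not truncated.

```cpp
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Walks a double-null-terminated buffer of "key=value" strings and splits
// each string at the first '='. A string without '=' yields an empty value.
std::vector<std::pair<std::string, std::string>> parseSectionBuffer(const char* buffer) {
    std::vector<std::pair<std::string, std::string>> entries;
    // An empty string (a lone '\0') marks the end of the list.
    for (const char* p = buffer; *p != '\0'; p += std::strlen(p) + 1) {
        const std::string line(p);
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) {
            entries.emplace_back(line, std::string());
        } else {
            entries.emplace_back(line.substr(0, eq), line.substr(eq + 1));
        }
    }
    return entries;
}
```

Because the function only takes a pointer, it can be tried on any hand-built buffer without touching the Windows API.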

I spent a good half an hour figuring out how to properly append a char to a string. First I called s.append(c), and the compiler errored with a really unhelpful message. Once I figured out that I needed a second argument and called s.append(c, 1), the code compiled, but it didn’t work. That’s because the C++ people think backwards when it comes to the order of function parameters: the count comes first, so I should have called s.append(1, c).
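Here is a small sketch of the trap, with made-up function names. As far as I understand it, std::string has no append overload that takes a single char, and append(count, ch) accepts the swapped call silently because a char converts to a count and 1 converts to a char.

```cpp
#include <string>

// The straightforward ways to append one character.
std::string appendChar(std::string s, char c) {
    s.push_back(c);  // equivalent here: s += c; or s.append(1, c);
    return s;
}

// The swapped call: compiles, but appends c copies of the character '\x01'.
// With c == 'A' on an ASCII system that is 65 junk characters.
std::string appendSwapped(std::string s, char c) {
    s.append(c, 1);
    return s;
}
```

With the arguments swapped, "ab" plus 'A' becomes a 67-character string on an ASCII system instead of "abA", which explains the "compiled, but it didn’t work" part.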

Then I was placing the final null terminator over an existing null terminator; another 10 minutes to figure it out.

Another 15 minutes to figure out a normal way to implement string.StartsWith (you should use std::mismatch for this).

When it was all done and refactored and ready for inspection, I took a good look at the code. Even with the auto keyword, iterators, algorithms and the STL, the code is much less readable than the C# code. So, goodbye, good old C++ friend (pun intended :-), I’ll enjoy C# even more from now on.
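A minimal sketch of StartsWith built on std::mismatch follows; startsWith is my own name for it, not a standard function. The length check matters, because the three-iterator form of std::mismatch assumes the second range is at least as long as the first.

```cpp
#include <algorithm>
#include <string>

// True when text begins with prefix (case-sensitive).
bool startsWith(const std::string& text, const std::string& prefix) {
    if (prefix.size() > text.size()) return false;
    // mismatch returns the first position where the two ranges differ;
    // reaching prefix.end() means the whole prefix matched.
    return std::mismatch(prefix.begin(), prefix.end(), text.begin()).first == prefix.end();
}
```

For example, startsWith("IMAGE001", "IMAGE") is true, while startsWith("IM", "IMAGE") is false thanks to the length check.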

Something to follow

FuzzyDupes: .NET Assembly Documentation

Fuzzy text matching algorithms links

“Algorithms for Approximate String Matching”, E. Ukkonen.

Simhash: http://matpalm.com/resemblance/simhash/

https://en.wikipedia.org/wiki/Longest_common_subsequence_problem

https://fuzzystring.codeplex.com/

http://chairnerd.seatgeek.com/fuzzywuzzy-fuzzy-string-matching-in-python/

https://blog.nishtahir.com/2015/09/20/fuzzy-string-matching-using-cosine-similarity/

http://www.tsjensen.com/blog/post/2011/05/27/Four+Functions+For+Finding+Fuzzy+String+Matches+In+C+Extensions

http://www.decisivedata.net/blog/cleaning-messy-data-sql-part-1-fuzzy-matching-names/

http://ntz-develop.blogspot.ro/2011/03/fuzzy-string-search.html

 

Excel Fuzzy Lookup Add-In (from Microsoft Research): https://www.microsoft.com/en-us/download/details.aspx?id=15011

 

Jaro-Winkler algorithm

Cosine similarity

Bella got its revenge

Yesterday was a big day for my pets: yearly inoculation day! While Bella deeply enjoys car rides (it’s a dog, after all 🙂), Magellan deeply hates being transported by car, especially since it has to travel in a pet box. The first time we tried to travel with it, it defecated in the box and I had to stop to clean up. Since we are its masters, Magellan does not hate us, but it has its ways of protesting.

As my vet has often told me, cats have an extra sense: when it’s time to pick them up for the vet visit, they know it well in advance, so they prepare their escape well in advance. Magellan had planned something, but not well this time: it went under the couch. In a moment of inspiration I told Bella to help me, and then I tried a few things to get Maggie out. Not a chance, until I got exasperated and simply lifted the heavy couch. Maggie jumped out, rushing for the alpha site, when Bella put its front paw on it, with its entire weight of 8.8 kg behind it … and Maggie stopped, really stumped.

Easy-peasy then: I just picked up the cat, placed it inside the pet box and off we went. Bella was ecstatic the entire day. I am certain part of it came from the fact that it got its revenge for all the times the cat has ambushed it.

Incompetent journalists everywhere

It has cut to the bone by now … the Internet is full of articles with incomplete information. The last article I read interviews a head-hunter, who says “salaries for good programmers range between 2000 and 4000 Euro”. Fine, but are those gross or net salaries? Because if they are gross, the net salaries range between 1000 and 2000 Euro (that is how savage the taxation of labor is in Romania). And the incompetent journalists don’t even think to analyze things thoroughly, to clarify, to add a few questions … no, let the gossipy auntie on the tram, on her way to pick a fight at the market, hear that someone in IT has a salary of 4000 Euro, so that she has a reason to give a piece of her mind to some young guy with glasses who doesn’t get out of her way in time (because he has nowhere to go, not because he doesn’t want to or doesn’t notice).