I just spoke with an old Romanian friend of mine who told me this joke, which was famous in Romania during the “old times”, and I thought I’d put it out here as it’s bloody hilarious. Quite likely “oldies” like me will laugh at this, but for the Romanian youngsters out there it probably means nothing. Sic transit… so to speak. (As a heads-up, this is a Radio Yerevan joke.)
Radio Yerevan gets asked:
Is it true that comrade Stoianov won a family holiday including the airplane trip to New York, USA and back following your last week’s radio broadcast?
Radio Yerevan answers:
Yes, that is true; however, a few corrections, since you haven’t got your facts quite right:
- It wasn’t a family holiday, it was just for him
- And it wasn’t a holiday, it was just a trip
- And it wasn’t to New York, it was to Moscow
- And it wasn’t by plane, it was by bike
- And finally: he didn’t win it, he lost it!
Thank you for your question!
I’ve decided to plug Cobertura into some of my projects to get an idea of the unit test code coverage going on. I use Gradle, so I started looking at the Cobertura Gradle plugin. It turns out it’s pretty good and offers a lot of the functionality that I needed. However, I came across a (weird) issue, so I’m going to investigate it and present the findings in this post.
First of all, I’m planning on using Cobertura as part of the build and having it fail the build if the coverage falls under a certain threshold. This means that if we start with a build that’s acceptable (in terms of code coverage), any changes made in the future need to have accompanying tests, decent enough to keep the coverage above that threshold. It turns out this is easily done with the plugin by setting coverageCheckHaltOnFailure coupled with coverageCheckTotalLineRate. To start with, I set the threshold quite low and gave it a spin…
Turns out my project failed right away, as in places its line coverage was falling under 20%!!! WTF?
I set off to look into it, and it turns out there are a few packages where my line coverage is nearly 0! A closer look reveals there are a lot of simple Java beans / POJOs in there and also some exceptions. I’m pretty sure that no one cares about testing their getters and setters and their exception constructors, right?
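To make the arithmetic concrete, here is a minimal sketch of how a handful of untested bean and exception packages can drag the total line rate down. The package names and line counts are made up for illustration and are not taken from my project; the calculation is simply covered lines divided by total lines, which is an assumption about how the total is rolled up rather than a description of Cobertura’s internals.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: hypothetical packages and line counts.
class CoverageSketch {

    // Line rate as a percentage: covered lines divided by total lines.
    static double lineRate(int covered, int total) {
        return total == 0 ? 100.0 : 100.0 * covered / total;
    }

    // Overall line rate across all packages; each value is {covered, total}.
    static double totalLineRate(Map<String, int[]> packages) {
        int covered = 0;
        int total = 0;
        for (int[] counts : packages.values()) {
            covered += counts[0];
            total += counts[1];
        }
        return lineRate(covered, total);
    }

    public static void main(String[] args) {
        Map<String, int[]> packages = new LinkedHashMap<>();
        packages.put("com.example.core", new int[] {450, 500});       // well tested
        packages.put("com.example.beans", new int[] {0, 400});        // getters and setters only
        packages.put("com.example.exceptions", new int[] {0, 100});   // constructors only

        for (Map.Entry<String, int[]> entry : packages.entrySet()) {
            int[] counts = entry.getValue();
            System.out.printf("%-24s %5.1f%%%n", entry.getKey(), lineRate(counts[0], counts[1]));
        }
        System.out.printf("%-24s %5.1f%%%n", "total", totalLineRate(packages));
    }
}
```

With these made-up numbers the core package sits at 90% while the total comes out at 45%, purely because of the untested beans and exceptions.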
I find myself nowadays mixing a lot of JVM languages: I write a lot of “core” code in Java, as I somehow prefer the verbosity of it, but then a lot of the time I find I just need utilities or “quickies” around this code, be it for unit testing purposes or to provide nicer interfaces to the outside. And that’s when I switch to Groovy and its syntactic sugar, so to speak.
I write about 50% of my unit tests in Groovy nowadays, though, granted, partially because of the Spock framework too. I also tend to write most of my beans in Groovy now, as it allows for a terser syntax while achieving the same result. In this post I’ll walk you through one of the niceties in Groovy around Java beans.
First of all, I’ll just say that before using Groovy for my Java beans, I did try out Project Lombok in Java and relied on its annotations for generating getters, setters, constructors and so on. However, at the time I found it to be buggy, and it still needed a few annotations in place for what seems to me a rather natural thing when it comes to Java beans: constructor, getters and setters. As such, I ditched it in the end and switched to Groovy. (I’m sure the framework will improve in time, so I might soon have to revisit it and see where it has got to.)
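For context, this is roughly the kind of hand-written boilerplate in question. It is only a sketch: the Person class and its two fields are hypothetical and exist purely for illustration.

```java
// Hypothetical bean, used only to illustrate the boilerplate.
class Person {
    private String name;
    private int age;

    // No-argument constructor, as expected by the Java bean convention.
    public Person() {
    }

    public Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    @Override
    public String toString() {
        return "Person{name='" + name + "', age=" + age + "}";
    }

    public static void main(String[] args) {
        Person person = new Person("Ana", 30);
        person.setAge(31);
        System.out.println(person);
    }
}
```

Two fields already cost some forty lines here. In Groovy, declaring the two properties is enough, since the getters and setters are generated for you.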
OK so I felt like I needed to share this code with “the world” as I think it’s a good example of a few things:
And now, after this (rather majestic and) long intro, let’s look at what this small, costly mistake I’m talking about actually is.
I encountered this with Gradle recently and struggled to find a solution right away: it took a bit more reading about Gradle and Checkstyle, plus some digging, until I found the (rather simple) solution. So I thought I’d post it here so that others can hopefully find it when they encounter the same problem.
The problem that I encountered is the following: I’ve put together a Java project which exposes an API and compiled it as a jar/library. The idea is that other projects I’m working on can use this API, as you would expect.
And since I wrote the library mainly for my own use, as soon as I was done with it I started working on the project which was going to use this API. In the process of working on this, I realized that I actually needed to make a couple of small changes to the initial library, so to save me switching between the 2 projects, compiling, updating dependencies and so on, I used Gradle’s multi-project facility. (I actually combined this with Git’s feature of having sub-projects in a repo, so my main project includes the library in Git as a subproject, and I then referenced that project as an include in the Gradle build files.)
All looks pretty standard so far, no surprises there. Until I run a gradle clean build in the main project and get an error about Checkstyle not finding its configuration files! WTF??? The library (sub)project compiles fine if I launch Gradle in its own folder, so where is the problem?
I was asked recently by a friend of mine what my “standard” day of work at Netflix consists of. I had to explain to him that it’s hard to talk about a “standard” day, as each day sees me looking at different pieces of our infrastructure and brings different challenges to be solved. Still, I explained that there are a few common denominators throughout a day at Netflix, and in fact throughout a day of work for anyone who works in software development. One major such common denominator, as I explained to him, is the fact that we, software engineers (“coders” as we are often labelled), spend lots of hours a day looking at lines of code and producing lines of such code. Then testing it, checking if it works, tweaking it a bit, trying it again and so on until we reach perfection or code nirvana. At this point we are happy with our work and ready to move on to the next task. (Which more often than not is to deploy that code onto servers, but perhaps I’ll talk about that bit in a separate blog post. For now, I will just concentrate on the fact that we spend a lot of our time in front of countless lines of code, written by us or someone else.)
“Do you guys not get bored?” asked my friend, having listened to me talking about this.
I wrote before about what I think of handwriting — I think this is a dying communication form. And this is in favour of visual communication. See my previous post on whether handwriting is something I need to keep in my brain or not as well, where I was talking about the fact that one picture can actually communicate in millisecond time the same amount of information — without my brain having to give all the commands to my hands to “encode” the message in writing then send a text/email to my friend which then has to engage his brain to decode the letters, assemble the phrase then decode the phrase again and finally interpret the meaning of it; a single image encodes all of that and making it visual makes it much easier for the brain to decode.
This has had me obsessed lately with whether it’s just me thinking this way or whether this is actually happening. And just when I start thinking that maybe it’s just a silly thought of mine, something like this happens:
I recently worked at Netflix on a project which was hitting one of our Cassandra clusters. (By the way, we use Cassandra here a lot; wherever possible we prefer it to an RDBMS, so we’ve got tons of instances running Cassandra.) Part of what my code had to do was to retrieve a set of records, apply some transformation to one field, then write the result to an output file. It is such a simple ETL that I didn’t spend too much time on it initially and simply wrote some code which ran a CQL (Cassandra Query Language) query to retrieve the fields that I needed, applied the processing and wrote the output file line by line.
Of course, in doing so, I missed one important aspect: the volume of data (ouch!). This ETL is set to process about 100 million records, and even though my code makes sure I only retrieve the columns that I want and not the full row (which would flood the network with a whole bunch of Cassandra columns I have no use for!), it still dragged like a snail the first time I ran it! (I did a quick calculation at the time and it would have taken something like 3-4 days to finish. Ouch!!)
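To put that estimate in perspective: 100 million records in 3-4 days works out to only about 290 to 385 records per second on average. The sketch below just redoes that back-of-the-envelope arithmetic; the class and method names are made up for illustration, and it does not touch Cassandra at all.

```java
// Back-of-the-envelope throughput arithmetic; names are illustrative only.
class ThroughputEstimate {
    static final long SECONDS_PER_DAY = 24L * 60 * 60;

    // Average records per second implied by processing `records` in `days`.
    static double recordsPerSecond(long records, double days) {
        return records / (days * SECONDS_PER_DAY);
    }

    // Days needed to process `records` at a given average rate.
    static double daysAtRate(long records, double recordsPerSecond) {
        return records / recordsPerSecond / SECONDS_PER_DAY;
    }

    public static void main(String[] args) {
        long records = 100_000_000L;
        System.out.printf("3 days -> %.0f records/s%n", recordsPerSecond(records, 3));
        System.out.printf("4 days -> %.0f records/s%n", recordsPerSecond(records, 4));
        // A hypothetical target rate, to show how the run time scales.
        System.out.printf("10000 records/s -> %.2f days%n", daysAtRate(records, 10_000));
    }
}
```

Seen this way, even a modest improvement in records per second translates directly into days saved on the full run.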
Ok, so if you haven’t been watching my activity on GitHub you might have missed this, and as such I feel it deserves a full-on blog post. Recently, having joined Netflix, I started using some of their libraries, as is to be expected. One of the things that I used pretty much from day one here was the Genie library. To quote from Genie’s page on GitHub:
Genie is a federated job execution engine developed by Netflix. Genie provides REST-ful APIs to run a variety of big data jobs like Hadoop, Pig, Hive, Presto, Sqoop and more. It also provides APIs for managing many distributed processing cluster configurations and the commands and applications which run on them.
As you can probably figure out from the above, I’m using Genie for querying some of our Hive datastores. And in doing so, I’m using the Genie client code which Netflix provides with this package, available on GitHub: https://github.com/Netflix/genie/tree/develop/genie-client
However, having looked at the sample code they provided, I realised it can actually be improved. I spoke with the folks here who look after the Genie project and it quickly transpired that the client library is indeed in need of some lovin’. So I set off and put together a pull request (https://github.com/Netflix/genie/pull/116). This has now been merged into the main trunk; however, I think it needs a bit of attention, as I’ve seen code presented in this project used elsewhere which can be improved based on the changes I put together in that pull request. This blog post will walk you quickly through these changes. If you are using pieces of code from the client’s code on GitHub, it might be worth reviewing your code to see if my changes can be applied in your project too.