TIL: Mongo Sucks

MongoDB $nin and $ne and $exists: false queries are super expensive. I mean there is a point at which they are basically unusable. And I learned it the hard way, having developed a service, that relies heavily on this kind of queries to emulate queue-like workflow. 5+ million documents later and I’m hurriedly adding indexes for all flags that take more than 60 seconds to query for.

Information on this issue is limited to a short FAQ entry at Mongo docs and a couple StackOverflow answers. That’s it. Come on, it should be the first thing people see, when they open Mongo docs – a huge dialog you should scroll through three times before you can click accept.

Remote Debugging with PyCharm

I’m working in a project now, that requires a certain (server) environment to run, hence it is developed on my local machine and then gets deployed on remote server. I thought I’m gonna say bye bye to my favorite PyCharm feature, namely the debugger, but to my surprise remote debugging has been supported for years now. It took some time to figure out (tutorials online are a bit ambiguous), so here is a short report on my findings.

For the sake of this tutorial let’s assume the following:

  • Remote host: foo_host.io
  • Remote user: foo_usr (/home/foo_usr/)
  • Local user: bar_usr (/home/bar_usr/)
  • Path to the project on the local machine: /home/bar_usr/proj

Here goes the step-by-step how-to:

  1. First we need to set up remote deploy, if you haven’t done so already. Go to Tools → Deployment → Configuration. And set up access to your remote server via SSH. I’d use:
    • Type: SFTP
    • SFTP host: foo_host.io (don’t forget to test the connection before applying)
    • Port: 22 (obvously)
    • Root path: /home/foo_usr
    • User name: foo_usr
    • Auth type: Key pair (OpenSSH or PyTTY)
    • Private key file: /home/bar_usr/.ssh/id_rsa (you’d need to generate the key and ssh-copy-id it to the remote machine, which is outside of the scope of this tutorial).
  2. Go to Mappings tab and add Deployment path on server (pehaps, the name of your project)
  3. Now under Tools → Deployment you have an option to deploy your code to remote server. These first three steps could be replaced with simple Git repository on the side of the server, however I sometimes prefer this way.
  4. Now when you have the deployment set up you can go Tools → Deployment → Upload to ..., note however, that it deploys only the file you have opened or the directory you selected in the project view, so if you need to sync the whole project just select your project root.
  5. I use virtualenv, so at this step I need to ssh into the remote machine and set up virtualenv in your project directory (/home/foo_usr/test/.env), which is outside of the scope of this tutorial. If you’re planning on using the global Pyhton interpreter, just skip this step.
  6. Now let’s go File → Settings → Project ... → Project Interpreter. Using gear button select Add Remote. The following dialog window would let you set up a remote interpreter over SSH (including remote .env), Vagrant or using deployment configuration you have set up previously. For the sake of this tutorial I’m going to put something like that there (using SSH of course):
    • Host: foo_host.io
    • Port: 22 (which is there by default)
    • User name: foo_usr
    • Auth type: Key pair (OpenSSH or PyTTY)
    • Private key file: /home/bar_usr/.ssh/id_rsa
    • Python interpreter path: /home/foo_usr/proj/.env/bin/python
  7. If you set up everything correctly, it should list all the packages installed in your remote environment (if any) and select this interpreter for your project.
  8. Now let’s do the last, but the most important step: configure debugging. Go to Edit Configurations… menu and set things up accordingly. For our hypothetical project I will use the following:
    • Script: proj/run.py (or something along these lines)
    • Python interpreter: just select the remote interpreter you have set up earlier.
    • Working directory: /home/bar_usr/proj/ (note that this is working directory on local machine)
    • Path mapping: create a mapping along the lines of /home/bar_usr/proj = /home/foo_usr/proj (although this seems pretty easy, it may get tricky sometimes, when you forget about mappings and move the projects around, be careful).
That’s it. Now we should have a more or less working configuration that you could use both for debugging and running your project. Don’t forget to update/redeploy your project before running as the versions may get async and PyCharm would get all whiny about missing files.

Short History of my Relationship with Lenovo ThinkPad

My own ThinkPad X201.
My own ThinkPad X201 workhorse
For those, who have limited time, this post comes down to the following statement: Lenovo Thinkpad X201 is the best X-series Thinkpad created yet (although after a somewhat heated discussion at Reddit, x220 looks better). What follows is my attempt at proving this point with my merely anecdotal evidence. I’m funny like that. Here comes a short story of my relationship with Lenovo Thinkpad X series.

In 2013 I’ve been working as a technical writer (more technical than a writer actually) in a medium-sized web-slash-mobile startup and the Macbook, they’d given me, failed and I decided to try something new. At that time I got increasingly interested in Lenovo Thinkpad (yeah, it hadn’t been IBM for quite a long time already). A couple of my colleagues had these X220 machines and they seemed pretty solid and professional, especially with all kinds of Linux installed on them (I worked with a bunch of Python devs and everybody used their favorite flavor of Linux). My transition from Macbook to Thinkpad was also dictated by how Macbook wouldn’t let me use i3wm (which I was completely sold on at the time) as the main WM. So I went to my manager an he approved my order. The problem was I wasn’t really familiar with modern ThinkPads then and ordered 14″ model (thought I could use all the extra screen space). I figured any Thinkpad will do. It was my mistake.

I got T431S, which was admittedly quite expensive at the time, but didn’t look like Thinkpad at all. If anything it resembled plastic version of Macbook. It had a rather disguising chiclet (island-type) keyboard, no LED indicators, thinner body and as a result much less ports (although for the record I do understand S stands for slim). The only thing it had in common with the previous generations of Thinkpads was the clit, which was kinda useless without the additional row of buttons, the device actually had no touchpad buttons at all as it mimicked the Macbook-style platform touchpad (awful, awful trend, actually). The hardware was of questionable quality and it gave me lots of headache on Linux (especially WiFi) despite the ThinkPads traditionally being considered one of the best laptops, when it comes to compatibility with Linux. I worked on this machine until the company went under, and got used to it somehow, but it never lived up to the image of Thinkpad I had in my head.

Even after that I didn’t give up on the Thinkpad series completely, though it clearly went downhill with every subsequent model. My wife got herself X230 at work and as I got to play with the device a bit, I had an impression that this is not as bad as 431S, so as the line moved forward I decided to go in the opposite direction. At that time I started working in a medium enterprise infosec company and they had Thinkpads all over the place, and most of these were the Thinkpads as I expected them to be from the day one. These were X201 models. They aren’t as outdated as the earlier ones but they have all the right features. Here is a short comparison between some of the latest X series models:

X201 X220 X230 X240
LED Indicators 9 on the front and 3 are mirrored on the back. 3 on the front and 2 on the back. 2 on the front and 2 on the back. None (!)
Keyboard Classic Classic Chiclet Chiclet
Ports VGA, Ethernet, 3 USB, separate ports for mic and headphones, phone line port, ECSC slot. VGA, Mini DisplayPort, 3 USB (1 USB 3.0), combo audio jack, media card reader slot, ECSC slot. VGA, Mini DisplayPort, 3 USB (2 USB 3.0), combo audio jack, media card reader slot, ECSC slot. VGA, Mini DisplayPort, 2 USB 3.0, combo audio jack, media card reader slot, ECSC slot.
ThinkLight (Keyboard Flashlight) Yes Yes Yes The keyboard is backlit instead. Get your tongue out of Apple’s ass, Lenovo!
Clit Buttons Yes Yes Yes Touchpad is a platform with areas for clit buttons, which is kinda sad.*

* – to be fair Thinkpad X250 actually went back to having hardware buttons, so X240 is not the whole new tendency, but rather disappointing stumble.

So, to sum it up for X201:

  • There is the right number of LED indicators (X220 and X230 have less and X240 seem to have none whatsoever) and they are mirrored on the back of the machine, which is convenient, when the lid is closed.
  • The classic Thinkpad keyboard is just right for coding. No trendy chiclet bullshit.
  • Ports and slots is the area, where the age of the machine shows the most. It doesn’t have any USB 3.0 ports and Mini DisplayPort would be actually nice. Still, it’s much better than X240.
  • There are two rows of buttons, one for clit mode and the other for the touchpad. Although I work with clit most of the time, I find having an additional bottom row rather convenient, yet I’d probably go with no touchpad at all.
  • Flashlight!
  • The only problem with X201 model for me is that it’s not available for sale officially anymore (at least where I live), I even tried to buy out my office X201, when leaving the company but they wouldn’t let me. So I found a place that sold used ThinkPads for a reasonable price and bought one from there. This machine is pure magic, and it doesn’t matter that it’s a bit outdated. It has i3 CPU (which is still fair these days), up to 8GB RAM (which is usually enough), extended 6 cell battery makes up for its age (easily gives me 5 or 4 hours of relaxed coding) and overall design hints at the times when the word Thinkpad meant something more than “an ugly plastic Macbook knockoff”. Without much exaggeration I can say, that in 12.5″ X line of ThinkPads (at least to me) the X201 model seems greatly superior to anything made before (due to being relatively modern) or after. It’s still relevant today and has the potential of being developer’s muse (fetishist talking) and workhorse.

    An update is due: although I still think X201 is one of the best ThinkPad X-series machines, after a heated Reddit discussion X220 seems to be an even better model with all the advantages of X201 (except the amount of LEDs 😉 ), plus newer hardware and better screen. You probably should consider that machine if you are shopping for classic ThinkPads.

    Humble Collection of Python Sphinx Gotchas: Part I

    Gotcha 1: Getting Started with Sphinx Theming

    Update 02.10.2014: It seems absolutely obvious to me, but apparently it is not that obvious to some: you can’t do anything about your Sphinx theme without some basic CSS and HTML skills. You can only change some of the theme parameters, if the developer was kind enough to add some, but it is well covered in the official documentation.


    Customizing Sphinx visuals was somewhat upsetting to me at first, since the process is not straightforward and documentation is scarce. It could be a little bit user-friendlier. However, as soon as you get a grip on the basics, it gets pretty smooth and simple. You may have already tried copying over the default theme only to discover, that there is nothing particularly useful in there for an inquisitive scholar. Only one of the standard Sphinx themes is in fact complete, the rest of them simply inherit its properties to add some minor alterations. This theme is called Basic and it’s a minimal sandbox template, the only theme that could be helpful for getting to the very bottom of Sphinx customization. Later you’d be able to inherit it and create a template, consisting only of alterations, but for a start it’s OK to copy the Basic in its entirety.

    Hopefully, you’ve already created a folder for your Sphinx project and initiated it by issuing:

    $ sphinx-quickstart
    

    Or you may have an already existing Sphinx project, you want to theme — it’s up to you. In your project folder create the _themes directory and then rename and copy the theme folder there. Basic theme, perhaps, should be located in site_packages folder of your active Python install. On my Mac it’s: /Library/Python/2.7/site-packages/Sphinx-1.2b1-py2.7.egg/sphinx/themes/basic/, but Python on Mac OS X is just weird. If you’re using Linux (/usr/share/sphinx/themes/basic/ on Debian) or Windows, you should look it up yourself.

    Next step would be to change conf.py of your project, accordingly. First, we should make sure, that the following line is uncommented and correct:

    html_theme_path = ['_themes']
    

    Don’t forget to check the value of the html_theme parameter above:

    html_theme = 'renamed_basic'
    
    Basic Theme
    Basic theme without customization looks like vanilla HTML with no stylesheet whatsoever.

    As of now you may start making alterations to the theme. You will find that HTML files in there aren’t really HTML files, but templates (Sphinx uses Jinja template engine) with some staff automatically inserted on build. You can combine these automatic tags with basic HTML tags and as soon as you figure out how it works, you could move some of the interactive tags around, or get rid of some of them altogether. Don’t forget to check whether you’re not breaking anything though. Most visual aspects of Sphinx theme are modified through the main CSS file, which is located in the static folder. For Basic theme it would be basic.css_t. Notice, that t in the extension brings to our attention the fact, that this is a template. Other than that it could be viewed and edited as a simple CSS file. If you’re interested in the values, provided by Sphinx templates and how you cam make use of them, consult the official documentation. If you want the main CSS file to have some other name, you can change that in theme.conf, there are also some other settings, that could be of interest.

    Gotcha 2: Spoiler

    If you play with Sphinx templates for some time, you may want to start adding something more interactive. We could easily implement these kinds of things trough JavaScript or jQuerry (which is kindly included in the package).

    Let’s go trough a little example project, to recognize, what could be achieved with combined power of Sphinx and JavaScript. You may notice, that as for common elements in HTML themes there is usually a way to address them in JavaScript, it’s not that easy to introduce new classes and wrap some parts of the text into them. There is no custom div tag in Sphinx, unless you do that as .. raw directive which is not really a native way, and breaks lots of things. What I needed to do was to wrap some section of text into a div and make it collapsible on button pressed. You could achieve that by creating your own admonition and assigning it to certain HTML class.

    For example:

    .. admonition:: Request
    		:class: splr
     		Request example and parameters.
    

    You can then easily reference this admonition by its class in your JavaScript. For example you could put this jQuerry script to your page.html template:

        <script type="text/javascript">
        $('.splr').css("background-color", '#F0F0F0');
        $('.splr').css('box-shadow', '0px 0px 1px 1px #000000');
        $('.splr > .last').hide();
        $('.splr > .expanded + .last').show('normal');
        $('.splr > p').click(function() {
            $(this).find('.last').hide();
            $(this).toggleClass('expanded').toggleClass('collapsed').next('.last').toggle('normal');
        });
        </script>
    

    If implemented correctly, this script should turn the splr admonition into a collapsible drawer. I’ve been actively using this, when documenting HTTP APIs, since it’s very helpful to hide JSON responses by default. Note, that you could also use this method for special CSS effects. Imagine, if aside from usual note, warning and tip you could have yellow, blue and purple boxes for whatever reason you can think of? Well, it couldn’t be any simpler. Admonitions are good default containers for some parts of your text, that should differ in design or function from the rest of the page.

    Gotcha 3: Interactive TOC in Sidebar

    Not the worst case, but still a valid comparison of TOC trees in Default theme and our JavaScript enhanced Basic. Note, that on the right the subcategories may be collapsed and opened dynamically.
    Not the worst case, but still a valid comparison of TOC trees in Default theme and our JavaScript enhanced Basic.

    You may have noticed, that Sphinx often adds TOC to sidebar automatically, if it’s not explicitly placed in the page itself. While this is certainly a very useful feature, sometimes things get out of control. I didn’t use the worst case in the picture on the right, but it could get up to innumerable 1st level sections, each of them could have a number of subsections and so on. Sometimes it is a clear sign, that the page should be reorganized into multiple standalone pages, united by a category, but it’s not always possible or needed. It’s perfectly alright to have a long and deep TOC in some cases and Default Sphinx theme is terrible in that regard (as of 2014 it is fixed and children categories are shown only for the currently selected one).

    You could use an updated version of the algorithm from the previous example to collapse some parts of the TOC by default. Note, how this script works with ul and li tags of the TOC tree list. Some stuff is applied recursively on the highest level of the list, some — on the subsequent levels. Especially noticeable with different styles applied to different levels of the list, so that you could tell whether it is 1st, 2nd or even 3rd level title. Here is the full script:

        <script type="text/javascript">
        $('.sphinxsidebar').attr('position', 'fixed');
        $('.sphinxsidebar').css('position', 'fixed');
        $("h3:contains('Table Of Contents')").css('border-bottom', '1px');
        $("h3:contains('Table Of Contents')").text('TOC:');
    
        $('.sphinxsidebarwrapper li').css('background-color', '#B8B8B8');
        $('.sphinxsidebarwrapper li').css('box-shadow', '0px 0px 1px 1px #000000');
        $('.sphinxsidebarwrapper li').css('color', 'white');
        $('.sphinxsidebarwrapper li').css('border-color', '#000000');
    
        $('.sphinxsidebarwrapper li > ul > li').css('background', '#D0D0D0');
        $('.sphinxsidebarwrapper li > ul > li').css('box-shadow', '0px 0px 1px 1px #000000');
        $('.sphinxsidebarwrapper li > ul > li > ul > li').css('background', '#F0F0F0');
    
        $('.sphinxsidebarwrapper li > ul').hide();
        $('.sphinxsidebarwrapper li > .expanded + ul').show('normal');
        $('.sphinxsidebarwrapper li > a').click(function() {
        //hide everything
        $(this).find('li').hide();
        //toggle next ul
        $(this).toggleClass('expanded').toggleClass('collapsed').next('ul').toggle('normal');
        });
        </script>
    

    Perhaps, it won’t look very well, but it is a very simplified version to illustrate the concept. If you get the idea, you may alter or add .css methods to achieve more plausible visuals. You could also work on lower title levels, but you will have to figure it all out for yourself. In this version algorithm works best only with three headings levels from <h1> to <h3>, but it could definitely handle more.

    That’s it. I’ve compiled these examples as a full-featured theme project on GitHub. I’m going to polish it to some extent and perhaps, implement more interactive stuff over time. Feel free to contribute to this little project. Nope, never happened. Well, it did happened actually, but it was no good, sorry. If you come upon any issue with this little gotchas of mine, let me know. Sure, I’ve tested everything myself, but you never know. Also, you’re free to use any of these examples as a building block in your work with no attribution, since they’re rather generic and simple.

    Why Static Site Generators aren’t Good for My Blog

    I would be speaking about my own experiences and you may have noticed my in the title, which is there to remind you about the subjective nature of this article. I don’t dare to speak about your blog or any other blog out there, but my own. I have no intention to convert you to my side, yet I would be very glad to see some of the like-minded people out there. I know they exist. My own research on the topic has unveiled, that although static site generators have this really substantial and zealous following, there are sober voices in the crowd, appealing to common sense. This is exactly why Kevin Dangoor went “From WordPress to Octopress and Back” or why Michael Rooney is “Migrating from Octopress to WordPress” — in the exact opposite direction from the majority of switchers.

    Masses Migrating from WordPress to Octopress
    Illustrating the distinctive trend among majority of switchers.

    It doesn’t boil down solely to WordPress vs Octopress debate, as the issue in hands is much broader and may be represented as dynamic site engines vs static site generators. If you’re not aware of the difference between the two, here are the basics: with dynamic site your content is generated dynamically by an application running on the server side, static sites, on contrary, are pre-generated or written outright in HTML. Basically, dynamic sites are a web-applications that could change their behaviour instantly, depending on input and other factors, while static sites: well, they’re just HTML files passively sitting there, waiting for you to open and read them. Sure, with introduction of jQuerry, Java Script and HTML 5 to the mix, the difference gets a little less distinctive, but let’s stop at this level.

    So, static vs dynamic. It could be virtually anything: Tumblr vs Hyde, Blogger vs Pelican, Movable Type vs Jekyll, etc. Major differences between the two models are more or less the same. It means that we should be really comparing the models themselves, not their instances.

    So, what are the lucrative advantages of the static generation model? What makes people switch so quickly and without looking back? I came up with 3 most important reasons:

    1. Almost endless customizability.
    2. It’s mostly plain markup text files, that you can store in a Git repository.
    3. Increased loading speed and security (well, no features – no loopholes, obviously).

    All three are valid points and at some moment I went down the static path myself with all three in mind. First I went with Jekyll, then switched to Octopress. At some point I even tried to make Sphinx-generated site (sic!) work as a personal web-page, but lack of blog awareness and increasing complexity made me abandon this idea. When they say that running this kind of site is the easiest things to do, this is complete nonsense. In terms of comfortable workflow, I only can say a couple good things about pure Jekyll paired with GitHub Pages, but the result was so raw and required so much customization, that straightforward workflow was hardly an advantage. It is positioned as a toy for true geeks, tinkerers, but I don’t see how anyone could really benefit from this kind of tinkering (yeah, even geeks and tinkerers). I work in IT and we are here mainly to solve problems, not create tons of complimentary issues. Instead of reinventing the wheel, you could as well invest your time into something, that really needs to be done, perhaps — writing good stuff for your website.

    I’m not the only one who noticed it, but for several months I’ve been experimenting with static generators, I’ve hardly written half a post. It was a common problem, as I googled it. People dug so deep into the endless customization and switching routine that eventually they’ve stopped writing. I may sound conservative to some, but I still think that blogging is mainly about writing. Sure, to some extent, this problem is applicable to dynamic platforms too (ping pong between WordPress and Blogger, anyone?), but with static generators it grows to catastrophic proportions. They are the ultimate time drain for nerds and wannabes. Sure, if your time worth basically nothing and constant tweaking your blog is the best part about having it, be my guest. To me it looks pretty much like that little joke from Fry&Laurie:

    Oh, yes, my boyfriend’s a real DIY enthusiast, DIY mad.
    He’s decorated the whole room and he’s put up all these bookshelves, and now he’s writing all these books to put on them.

    Another seeming advantage of static site generators is their reliance on simple text, that could be utilized in distributed version control systems like Git. Most switchers assume, that they already use plain text and Git for code, why not use it for a blog? The problem is, that it also complicates things instead of easing them. Each generator has its own super-easy-workflow™ with different special folders, commands and scripts — quickly it becomes a mess. In this regard Git is basically one more noose on your neck. I’d focus on one especially nasty implementation of Git in such workflow — Octopress. You should fork the original Octopress repo, the _deploy folder will be used for deploy to the pages repository and sources should be committed to the special source branch. Now imagine what would happen if a somewhat major update gets pushed to the upstream of Octopress and your copy has been significantly modified over time. As someone has put it: “Octopress is great, until it breaks”. If you have some experience with Git and seen the Octopress workflow, you may imagine the hell it could possibly be. Actually, Octopress is itself a heavily modified version of Jenkins. No offence, but it all seems like Rupe Goldberg machine to me.

    If you google it, you will find lots of people, performing full-featured benchmark tests of WordPress vs Octopress with a complete disregard for the principal differences between the two. People start speaking of security and speed benefits of static sites, forgetting about all the advantages of dynamic sites, that come at the price of increased complexity and bloat (yes, I do think WordPress is a tad bloated). Imagine the situation: you need to jot down a post draft on a public or someone else’s PC? Will you be cloning the repo, installing ruby, Octopress — setting up all the environment to write a short post? What about mobile support? Should you attempt to clone Octopress to your mobile phone? What about preserving drafts in the cloud without publishing them, but having access to them virtually from anywhere and anytime? Can you really put a price on that? People start using Evernote or similar service for drafts and such, but does it really worth introducing another tool to your workflow? Does mobility and availability worth another couple seconds of load time? My own choice is comfort and efficiency. I want my blog to be complimentary to my technical endeavors, not the other way around.

    I’ve started thinking that less is actually more a long time ago and it may apply to blogging as to almost any other area of our life. You may notice, that I don’t even use the standalone WordPress install, but the pretty limited hosted WordPress site. I prefer to pay engineers at WordPress $13 for domain mapping and settle for less choice in themes, plugins and other options to focus on writing. We’re all too lost in the world of different platform and workflow options these days. Google it and you will see hundreds of rants about why platform A is deliberately better, than platform B, why static sites are better, than dynamic. You almost never hear that they help you in writing, no. It’s all about SEO, storage space, customization, load speed and other insignificant stuff, not directly related to blogging. We’re too obsessed with form and seem to forget about goddamn content. But, as I said before, it is all entirely subjective. You may still go down the static route, customize the ass out of your blog. You may even spend several years on writing your own static or dynamic blog engine from scratch, that will sure be absolutely unique and different from anything ever done before. Yet, I’m writing this post in a beautiful distraction-free WYSIWYG editor and my draft will be preserved online, when I press the Save button and no rake deploy is needed ever again.