Aspiring Photographers are Quitting Instagram en Masse and Here’s Why

Full disclaimer: the title is total clickbait. Still, I’m very concerned about the current state of Instagram and the direction the site has taken. Here are some of my grievances:

  • I haven’t seen anyone but bots in a while. Sure, that’s a gross exaggeration, but it certainly feels like actual people are in the minority on the site. Most accounts that like or follow me are spambots or low-tier SMM guys from the developing world.
  • My major grievance is the lack of posting in the official API. Sure, there are unofficial private API wrappers, but why should users have to jump through hoops? Limiting access to CRUD operations in the public API is the walled-garden approach at its finest.
  • No links. Instagram doesn’t allow clickable links or copying text in either descriptions or comments. I understand they’re doing it to minimize spam, but it’s like getting rid of your intestines to avoid diarrhea. Also, see the first point. How long will it take before the ineffectiveness of this approach becomes evident? In the meantime, there is no way for a regular user to share a link to a bigger story.
  • Instagram still reduces the quality of pictures and ruins your videos. Sure, it’s much better now than it used to be, but it’s still far from ideal. Plus, what’s up with the quality of Instagram pics on Facebook? Is this by design? Because I have no other explanation for how it can stay unfixed for so long.
  • It doesn’t cater to its original target audience: amateur photographers. Instagram is no longer a network for good phone photography (or attempts at it). Most pics one sees today are selfies, which says a lot about the service’s current target audience. The introduction of Stories (a 100% Snapchat rip-off) cemented this trend.

I’m still using the service to share what I deem interesting shots and the occasional video, but I’m also using Flickr as a kind of mirror feed, and to be honest, Instagram is starting to feel like dead weight.

TIL: Mongo Sucks

MongoDB $nin, $ne and $exists: false queries are super expensive. I mean, there is a point at which they are basically unusable. And I learned it the hard way, having developed a service that relies heavily on these kinds of queries to emulate a queue-like workflow. 5+ million documents later, I’m hurriedly adding indexes for all the flags that take more than 60 seconds to query for.

Information on this issue is limited to a short FAQ entry in the Mongo docs and a couple of StackOverflow answers. That’s it. Come on, it should be the first thing people see when they open the Mongo docs: a huge dialog you have to scroll through three times before you can click accept.

Overriding Default Werkzeug Exceptions in Flask

Let’s play a game here. Which HTTP status code goes with this error message:

{
"message": "The browser (or proxy) sent a request that this server could not understand."
}

No, no, don’t look at the status code in the response! That’s cheating! This is actually the default Werkzeug description for a 400. No shit. I thought something was wrong with my headers or encryption; I would never have guessed a simple Bad Request from this message. You could use a custom exception, of course. The problem is that the very useful abort(400) call (abort is an Aborter object in disguise) would stick with the default exception anyway.

Let’s fix it, shall we?

There may be several ways of fixing this, but what I’m gonna do is just update abort.mapping. Create a separate module for your custom HTTP exceptions, custom_http_exceptions.py, and put a couple of overridden exceptions there (don’t forget to import abort; we’ll need it in a moment):

from flask import abort
from werkzeug.exceptions import HTTPException


class BadRequest(HTTPException):
    code = 400
    description = 'Bad request.'


class NotFound(HTTPException):
    code = 404
    description = 'Resource not found.'

These are perfectly functional, but we still need to add them to the default abort mapping:

abort.mapping.update({
    400: BadRequest,
    404: NotFound
})

Note that I import the abort object from flask, not flask-restful. Only the former is an Aborter object with a mapping and other bells and whistles; the latter is just a function.

Now just import this module with * into your Flask app module (where you declare and run your Flask app), or anywhere else where the import would have the same effect at runtime.

Note that you should also have the following line in your config because of this issue:

ERROR_404_HELP = False

I’m not sure why this awkward and undocumented constant isn’t False by default. I opened an issue on GitHub, but no one seems to care.

Remote Debugging with PyCharm

I’m working on a project now that requires a certain (server) environment to run, so it is developed on my local machine and then deployed to a remote server. I thought I’d have to say bye-bye to my favorite PyCharm feature, namely the debugger, but to my surprise remote debugging has been supported for years now. It took some time to figure out (the tutorials online are a bit ambiguous), so here is a short report on my findings.

For the sake of this tutorial let’s assume the following:

  • Remote host: foo_host.io
  • Remote user: foo_usr (/home/foo_usr/)
  • Local user: bar_usr (/home/bar_usr/)
  • Path to the project on the local machine: /home/bar_usr/proj

Here goes the step-by-step how-to:

  1. First we need to set up remote deployment, if you haven’t done so already. Go to Tools → Deployment → Configuration and set up access to your remote server via SSH. I’d use:
    • Type: SFTP
    • SFTP host: foo_host.io (don’t forget to test the connection before applying)
    • Port: 22 (obviously)
    • Root path: /home/foo_usr
    • User name: foo_usr
    • Auth type: Key pair (OpenSSH or PuTTY)
    • Private key file: /home/bar_usr/.ssh/id_rsa (you’d need to generate the key and ssh-copy-id it to the remote machine, which is outside of the scope of this tutorial).
  2. Go to the Mappings tab and add the Deployment path on server (perhaps the name of your project).
  3. Now under Tools → Deployment you have an option to deploy your code to the remote server. These first three steps could be replaced with a simple Git repository on the server side, but I sometimes prefer this way.
  4. Now that you have the deployment set up, you can go to Tools → Deployment → Upload to ... Note, however, that it deploys only the file you have open or the directory you have selected in the project view, so if you need to sync the whole project, just select your project root.
  5. I use virtualenv, so at this step I need to ssh into the remote machine and set up a virtualenv in the project directory (/home/foo_usr/proj/.env), which is outside the scope of this tutorial. If you’re planning to use the global Python interpreter, just skip this step.
  6. Now let’s go to File → Settings → Project ... → Project Interpreter. Using the gear button, select Add Remote. The dialog window that follows lets you set up a remote interpreter over SSH (including a remote .env), Vagrant, or the deployment configuration you set up previously. For the sake of this tutorial, I’m going to enter something like this (using SSH, of course):
    • Host: foo_host.io
    • Port: 22 (which is there by default)
    • User name: foo_usr
    • Auth type: Key pair (OpenSSH or PuTTY)
    • Private key file: /home/bar_usr/.ssh/id_rsa
    • Python interpreter path: /home/foo_usr/proj/.env/bin/python
  7. If you set everything up correctly, it should list all the packages installed in your remote environment (if any) and select this interpreter for your project.
  8. Now for the last but most important step: configuring debugging. Go to the Edit Configurations… menu and set things up accordingly. For our hypothetical project I will use the following:
    • Script: proj/run.py (or something along these lines)
    • Python interpreter: just select the remote interpreter you have set up earlier.
    • Working directory: /home/bar_usr/proj/ (note that this is the working directory on the local machine)
    • Path mapping: create a mapping along the lines of /home/bar_usr/proj = /home/foo_usr/proj (although this seems pretty easy, it can get tricky when you forget about the mappings and move projects around, so be careful).

That’s it. Now we should have a more or less working configuration that you can use both for debugging and for running your project. Don’t forget to update/redeploy your project before running, as the versions may get out of sync and PyCharm will get all whiny about missing files.

My Take on Yandex Pre-interview Python Assignment

I applied for a junior Python position at the Russian internet giant Yandex (very similar to Google). Although my application was rejected due to lack of experience, I think their little pre-interview test and my take on it may be of interest to any inquisitive pythonista. Note that this has never been properly translated into English before, so it is probably an exclusive in that regard.

Assignment I

There are two lists of different lengths. The first one contains keys, the second one values. Write a function that creates a dict out of these lists. If a key has no value, the value should be None; if a value has no key, it should be omitted.

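from itertools import zip_longest

def get_dict(list1, list2):
    # Trim surplus values first so they are omitted (even falsy ones like 0),
    # then let zip_longest pad the missing values with None.
    return dict(zip_longest(list1, list2[:len(list1)]))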
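
For what it’s worth, the same behaviour can also be had without any padding helper. Below is a minimal Python 3 sketch; pair_up is my own placeholder name, not something from the assignment. It walks the keys and looks up the value at the same position, so surplus values are never touched.

```python
def pair_up(keys, values):
    """Build a dict from two lists of possibly different length.

    Keys without a matching value get None; values without a key are ignored.
    pair_up is a hypothetical name used for illustration only.
    """
    return {
        key: values[i] if i < len(values) else None
        for i, key in enumerate(keys)
    }


# More keys than values: the extra key maps to None.
print(pair_up(["a", "b", "c"], [1, 2]))  # {'a': 1, 'b': 2, 'c': None}
# More values than keys: the extra value is dropped, even a falsy one.
print(pair_up(["a"], [1, 0]))  # {'a': 1}
```

The second call is the case worth checking in any solution: a leftover value of 0 is easy to mishandle if the cleanup step tests the value for truthiness.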

Assignment II

A login must start with a Latin letter, may contain Latin letters, digits, dots and hyphens, and must end with a Latin letter or a digit. Minimum length is 1 character, maximum is 20 characters. Write a function that checks whether a string complies with these rules. Think of several methods of solving this problem and compare them.

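import re
import string
import timeit

LETTERS = set(string.ascii_letters)
ALNUM = LETTERS | set(string.digits)

def check1(login):
    # Regex version. fullmatch() rejects a trailing newline, which '$' would let through.
    return re.fullmatch(r'[a-zA-Z][a-zA-Z0-9.-]{0,19}(?<![.-])', login) is not None

def check2(login):
    # Plain-Python version: check the length, then the first, last and middle characters.
    # Explicit ASCII sets are used because isalpha()/isdigit() also accept non-Latin characters.
    if not 1 <= len(login) <= 20:
        return False
    if login[0] not in LETTERS or login[-1] not in ALNUM:
        return False
    return all(c in ALNUM or c in '-.' for c in login[1:-1])

def compare(login, number=100000):
    # Time many calls of each check; a single call is too fast to measure reliably.
    for check in (check1, check2):
        print(check.__name__, timeit.timeit(lambda: check(login), number=number))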
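
Whichever method you pick, it helps to compare the candidates for correctness before comparing them for speed. The sketch below is only an illustration under my own assumptions: is_valid_login, CASES and failing_cases are hypothetical names, and the regex is just one possible reading of the rules.

```python
import re

# One possible reading of the rules: the first character is a Latin letter,
# the last one is a Latin letter or a digit, 1 to 20 characters in total.
LOGIN_RE = re.compile(r"[a-zA-Z](?:[a-zA-Z0-9.-]{0,18}[a-zA-Z0-9])?")


def is_valid_login(login):
    """Return True if the whole string satisfies the login rules."""
    return LOGIN_RE.fullmatch(login) is not None


# (login, expected result) pairs covering the edge cases of the rules.
CASES = [
    ("a", True),         # minimum length
    ("a" * 20, True),    # maximum length
    ("a" * 21, False),   # too long
    ("", False),         # empty string
    ("1abc", False),     # starts with a digit
    ("ab-", False),      # ends with a hyphen
    ("a.b-c9", True),    # dots and hyphens in the middle
    ("abc\n", False),    # a trailing newline must not slip through
]


def failing_cases(check):
    """Return the logins for which check disagrees with the table."""
    return [login for login, expected in CASES if check(login) != expected]


print(failing_cases(is_valid_login))  # an empty list means every case passed
```

Any other checker can be passed to failing_cases the same way, which makes it easy to spot an implementation that is fast but wrong.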

Assignment III

There are two tables, users and messages (I changed the names and messages to non-Cyrillic ones):

users

UID  name
1    John Doe
2    Natalie Knaph
3    Johnatan Yozo

messages

UID  msg
1    Hello, John!
3    Send me the card, quickly.
3    I’m waiting on the corner of 5th and Lafayette
1    This is me again. Please message me more often.

Write an SQL query that returns two fields: “User name” and “Total amount of messages”.

SELECT users.name AS "User name", COUNT(messages.uid) AS "Total amount of messages"
FROM users
LEFT JOIN messages ON users.uid = messages.uid
GROUP BY users.uid, users.name
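
To try a query like this without a database server, Python’s built-in sqlite3 module is enough. The sketch below loads the sample rows into an in-memory database. The column types and the primary key are my assumptions, since the assignment only shows the data. It uses a LEFT JOIN so that users with no messages show up with a count of 0, and the ORDER BY is there purely to make the output stable.

```python
import sqlite3

# In-memory database with the sample rows from the assignment.
# The schema (types, primary key) is assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE messages (uid INTEGER, msg TEXT);
""")
conn.executemany("INSERT INTO users VALUES (?, ?)", [
    (1, "John Doe"),
    (2, "Natalie Knaph"),
    (3, "Johnatan Yozo"),
])
conn.executemany("INSERT INTO messages VALUES (?, ?)", [
    (1, "Hello, John!"),
    (3, "Send me the card, quickly."),
    (3, "I’m waiting on the corner of 5th and Lafayette"),
    (1, "This is me again. Please message me more often."),
])

# LEFT JOIN keeps users without messages; COUNT(messages.uid) skips the
# NULLs produced for them, so they are reported with a total of 0.
QUERY = """
    SELECT users.name AS "User name",
           COUNT(messages.uid) AS "Total amount of messages"
    FROM users
    LEFT JOIN messages ON users.uid = messages.uid
    GROUP BY users.uid, users.name
    ORDER BY users.uid
"""
rows = conn.execute(QUERY).fetchall()
for name, total in rows:
    print(name, total)
```

With a plain JOIN, Natalie Knaph would be missing from the result altogether, because she has no rows in messages.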

Assignment IV

Suppose you have a generic access.log. How would you get the 10 most frequent IP addresses using standard terminal tools? How would you do that with Python?

# BASH:
grep -o '[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}' access.log | sort -n | uniq -c | sort -n -r | head -10

# PYTHON:
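import re
import sys

# Read the whole log and pull out everything that looks like an IPv4 address.
with open(sys.argv[1]) as log:
    ips = re.findall(r"[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}", log.read())

# Sort by frequency (most common first), then keep the first occurrence of each address.
# Calling ips.count for every element is quadratic, so this gets slow on large logs.
srt = sorted(ips, key=ips.count, reverse=True)
unq = []
for m in srt:
    if m not in unq:
        unq.append(m)
print(unq[0:10])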
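
One alternative worth considering is collections.Counter from the standard library, which counts in a single pass. This is a minimal Python 3 sketch, not a definitive solution: top_ips and top_ips_in_file are hypothetical names, and the pattern is the same loose one as in the grep above, so it does not check that each octet is in the 0-255 range.

```python
import re
from collections import Counter

# Loose IPv4 pattern: four dot-separated groups of 1 to 3 digits.
IP_RE = re.compile(r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}")


def top_ips(lines, n=10):
    """Return the n most frequent IP-looking strings as (ip, count) pairs."""
    counts = Counter()
    for line in lines:
        counts.update(IP_RE.findall(line))
    return counts.most_common(n)


def top_ips_in_file(path, n=10):
    """Stream a log file line by line instead of reading it all into memory."""
    with open(path) as log:
        return top_ips(log, n)
```

Calling top_ips_in_file("access.log") would return the pairs ready for printing, and since the counting works on any iterable of lines, it is easy to try on a small list first.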

If you can think of a better way to solve any of these, let me know.