Rails migrations are contextual

Some time ago I added a foreign key column in a Rails migration, which also creates an index on that column through the add_foreign_key method. This index has a name of the form fk_rails_some_hex. Unfortunately, I forgot to make the index unique, which my use case required.

I thought ‘no big deal’, I’ll just add that now. I ran:

add_index :table_name, :column_name, unique: true, name: :some_name

and voilà, it worked on my local machine like a charm. It even removed the previous index on that column that the foreign key migration had created.

But when I ran this on staging, I ended up with two indexes, one unique and one non-unique -_- .

A little debugging confirmed a pattern:

If these two migrations run in the same rails db:migrate invocation, the add_index migration also overwrites the index created by the foreign key migration. However, if they run in separate invocations, it just adds another index.

Solution:

I’m not sure if this is a feature or a bug. To cover both the case where the migrations run together (dev machines) and the case where they run separately (servers), I used remove_index if index_exists? .

Conclusion:

This isn’t something “breaking or a huge discovery”, just something weird I found in my Rails journey and something you should be aware of.

Final Submission

This program was one of the best learning experiences I’ve ever had. During the entire GSoC phase I was able to contribute mainly to two repositories of the coala organisation, a sub-org under the Python Software Foundation.

These include commits on the coala repository:

https://github.com/coala-analyzer/coala/commits?author=abhsag24

and the coala-bears repository:

https://github.com/coala-analyzer/coala-bears/commits?author=abhsag24

coala-bears also contains a branch which hasn’t been merged into master yet. All my commits on this branch can be found here:

https://github.com/coala-analyzer/coala-bears/pull/653/commits

I’m really grateful to my mentor Fabian and to the admin of my sub-org, Lasse. I’ve had the most enriching discussions with these people 😀. Cheers to coala, FOSS and the entire GSoC experience. I’ll definitely look forward to working beside these people after the program as well.

Breaking Lines

As the GSoC period comes to an end I have only two major tasks left to do. One is making a LineBreakBear, which suggests line breaks when the user has lines of code longer than the max_line_length setting, and the other is adding indents based on keywords in the IndentationBear.

This week I was able to devise a simple algorithm to suggest line breaks. If you’ve followed my blogs you’d know that there’s something I like to call an encapsulator :P, it’s a fancy name for different, but not all, types of brackets. So the algorithm is as follows:

Get all occurrences of lines which exceed max_line_length.
Check whether these lines have an encapsulator which starts before the limit.
Find the last encapsulator that starts in this line before the limit.
Suggest a line break at that point, with the new line being indented in accordance with the indentation_width setting.
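
The four steps above can be sketched roughly as follows. This is only an illustration, not the LineBreakBear's code: the function name suggest_line_breaks, the OPENERS set and the default values are assumptions, and brackets inside strings or comments are not treated specially.

```python
OPENERS = "([{"  # assumed set of "encapsulators"; the real set may differ


def suggest_line_breaks(lines, max_line_length=79, indentation_width=4):
    """Return (line_number, first_part, second_part) for each long line
    that has an opening bracket before the length limit."""
    suggestions = []
    for number, line in enumerate(lines, start=1):
        # Step 1: only lines that exceed the limit are interesting.
        if len(line) <= max_line_length:
            continue
        # Steps 2 and 3: last opening bracket that starts before the limit.
        split_at = max(line.rfind(ch, 0, max_line_length) for ch in OPENERS)
        if split_at == -1:
            continue  # no encapsulator to break at
        # Step 4: break after the bracket, indent the rest one level deeper.
        current_indent = len(line) - len(line.lstrip(" "))
        new_indent = " " * (current_indent + indentation_width)
        first = line[:split_at + 1]
        second = new_indent + line[split_at + 1:].lstrip()
        suggestions.append((number, first, second))
    return suggestions
```

Calling it on a file's lines with a small max_line_length gives one suggestion per long line that contains a bracket. Lines without any encapsulator are simply skipped.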

Now this algorithm is really simple and does not consider edge cases such as hanging indents.

Hopefully by the next blog post I’ll have completed my project. I’ll have lots to share about my experience this year 🙂

week full of refactor

My project has grown a lot now. We are officially going to support C, C++, Python 3, JS, CSS and Java with our generic algorithms, though they’ll still be experimental owing to the nature of the bears.

The past two weeks were heavily concentrated on refactoring the algorithms of the AnnotationBear and the IndentationBear. The IndentationBear received only small fixes, while the AnnotationBear had to undergo a change of algorithm. The new and improved algorithm also distinguishes between single-line and multi-line strings, where earlier there were just strings.

The IndentationBear is close to completion, barring basic things like:

It still messes up your doc strings/ multi-line strings.
Still no support for keyword indents.

The next weeks’ efforts will go into introducing various indentation styles into the bear and fixing these issues, before we move on to the LineBreakBear and the FormatCodeBear.

tabs spaces tabs

Last time I was able to come up with an algorithm to indent Python code, or basically code without un-indent specifiers. This time the challenge was tackling hanging indentation.

Now hanging indents in terms of word processing occur when all the lines except the first line are indented. In code terms it is something like this:

some_function(
    param1,
    param2,
    param3)

Here param1, param2 and param3 are indented while some_function is not.

Again, the algorithm I use to do hanging indentation is pretty straightforward. In simple terms it is:

Check if there is text to the right of the parenthesis.
If there is, indent all lines up to the closing parenthesis to the column right after the parenthesis.
Otherwise calculate the indentation relative to some_function and indent all later params to that level.

This is the broad algorithm I use.
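
As a rough illustration of those three steps, here is a minimal sketch. The function name hanging_indent and its indent_width parameter are made up for this example, it assumes the line is indented with spaces only, and it is not the IndentationBear's actual code.

```python
def hanging_indent(line, indent_width=4):
    """Return the column at which the lines following ``line`` should
    start, or None if ``line`` leaves no parenthesis open."""
    stack = []
    for column, char in enumerate(line):
        if char == "(":
            stack.append(column)
        elif char == ")" and stack:
            stack.pop()
    if not stack:
        return None
    open_column = stack[-1]  # innermost parenthesis that is still open
    if line[open_column + 1:].strip():
        # Text follows the parenthesis: align to the column right after it.
        return open_column + 1
    # Nothing follows: indent one level relative to the start of the line.
    base = len(line) - len(line.lstrip())
    return base + indent_width
```

For `some_function(` this gives 4, which matches the example above, while for `int some_function(param1,` it gives 18, the column right after the parenthesis.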

Though the difficult part was actually aligning in the file. Now, I align my files to the correct indentation levels (barring hanging indents) by:

Remove all whitespace to the left of every line.

I have a list called indentation_levels which holds the indentation level for each line number, and also a variable insert which is either ‘\t’ or tab_width*‘ ’, where tab_width is the number of spaces to indent.

To each line, add insert*indentation_levels[line] to the left of the line.

Basically line -> insert*indentation_levels[line] + line .
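
In Python that re-indentation step could look something like this. It is only a sketch: the reindent function and its use_tabs flag are assumptions for the example, while indentation_levels, insert and tab_width follow the names used above.

```python
def reindent(lines, indentation_levels, use_tabs=False, tab_width=4):
    """Strip the leading whitespace of every line and re-apply it from
    ``indentation_levels`` (one integer level per line)."""
    insert = "\t" if use_tabs else tab_width * " "
    result = []
    for line, level in zip(lines, indentation_levels):
        stripped = line.lstrip(" \t")
        # Blank lines stay blank instead of getting trailing whitespace.
        result.append(insert * level + stripped if stripped.strip() else stripped)
    return result
```

For example, `reindent(["  a {", "b;", "      }"], [0, 1, 0])` returns `["a {", "    b;", "}"]`.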

Now the problem was to insert absolute_indentation_levels in between normal indentation.

So what’s the problem? Just add the number of spaces along with the normal indentation, right? WRONG!

There’s a difference between

\t        \t
and
        \t\t

real life example:

class {
\tfunction(param1{
\t        \t do_something();}
\t         param2)
}

I was able to accomplish this by:

storing the previous indentation of a block,
then adding previous indent + hanging indent + indent level of that line.

Do tell me if you find any fault in these algorithms. My work has recently been merged and can be found as the IndentationBear in the master branch of the coala-bears repository.
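
To make the ordering concrete, here is a small sketch of composing the leading whitespace in that order. The helper build_indent and its parameter names are hypothetical, and the loop at the end only checks one property of this layout: with the alignment spaces placed after the block's tabs, param2 stays under param1 whatever the tab width.

```python
def build_indent(block_levels, hanging_columns, extra_levels, insert="\t"):
    """Previous block indent, then hanging-indent spaces, then the
    line's own indent levels, in exactly that order."""
    return insert * block_levels + " " * hanging_columns + insert * extra_levels


# The do_something() line from the example: one tab, eight spaces, one tab.
assert build_indent(1, 8, 1) == "\t        \t"
# Same ingredients in a different order give a different string.
assert build_indent(1, 8, 1) != "        \t\t"

# The param2 line: one tab for the block, nine spaces for the hanging indent.
first = "\tfunction(param1{"
second = build_indent(1, 9, 0) + "param2)"
for tab_width in (2, 4, 8):
    a = first.expandtabs(tab_width)
    b = second.expandtabs(tab_width)
    assert a.index("param1") == b.index("param2")
```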

Python, indentation and white-space

Updates:

At the time of the last update I was able to do basic indentation whenever a start and an end indent specifier was provided. This time around I’m working on the case where the end-indent specifier is not provided, for example in languages like Python:

def func(x):
    indent-level1
indent-level2

Here we can see that there is no specifier announcing that an unindent is going to occur, so how do I figure out which lines are part of one block?

Well, the answer is very simple actually: I look for the start indent specifier, which in the case of Python is the very famous ‘ : ‘. After I find the start indent specifier, the next step is to find an unindent. In the previous example the line containing ‘indent-level2’ unindents, and voilà, we have our block, starting from the indent specifier and running to the first unindent. Easy, right? The answer to that is NO, nothing’s that easy.

Python doesn’t care about white-space:

Well, as we all know this isn’t true: Python does care about white-space, but not as much as we thought. Python only cares about white-space to figure out indentation; anything else is pretty much useless to it. For example:

def func(x):
    a = [1, 2,
3, 4, 5]
    if x in a:
        print(x)

This is perfectly valid Python code, which prints x if x is an integer from 1 to 5. What is odd about this example is that, as we know, in Python everything has to be indented, right? And this breaks that rule! Go ahead, try this on your own, it works! So no, even in Python not everything has to be indented. A simpler example could have been:

def func(x):
    a = 1
# This comment is not indented
    print(a)

Does it matter if this comment is not indented? Absolutely not! This is perfectly valid Python code as well.
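
If you would rather check this than take my word for it, the built-in compile() can be used to test both snippets without running them. The helper is_valid_python and the third, deliberately broken snippet are additions for this illustration only.

```python
BRACKET_EXAMPLE = (
    "def func(x):\n"
    "    a = [1, 2,\n"
    "3, 4, 5]\n"
    "    if x in a:\n"
    "        print(x)\n"
)

COMMENT_EXAMPLE = (
    "def func(x):\n"
    "    a = 1\n"
    "# This comment is not indented\n"
    "    print(a)\n"
)

# For contrast: a real statement at column 0 inside the function body.
BROKEN_EXAMPLE = (
    "def func(x):\n"
    "    a = 1\n"
    "b = 2\n"
    "    print(a)\n"
)


def is_valid_python(source):
    """True if ``source`` compiles; IndentationError is a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True
```

The first two snippets compile, while the third fails with an indentation error, because there an actual statement, not a bracket continuation or a comment, unindents.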

The Problem:

So how is all this related to my algorithm? As you can see in the second example, the line ‘# This comment is not indented’ unindents, and my algorithm is searching for unindents. This breaks my algorithm, as it would think that the block starts at ‘def func(x):’ and ends at ‘# This comment is not indented’. In the first example it would likewise find that the line ‘3, 4, 5]’ unindents, which would again break the algorithm.

The Solution:

The solution is quite simple in theory: just be aware of these cases. But that changes the algorithm completely. It goes from:

Check for the first unindent.
Report the block as running from the line containing the specifier to the line which unindents.

to:

Check for an unindent.
Check if this line is a comment.
Check if this line is inside a multi-line comment.
Check if this line is inside parentheses () or square brackets [].
If any of these is true, repeat from the first step.
Else report the block.
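
A rough sketch of the second version is below. It is not the code from the PR: the names find_block_end and indent_of are invented for this example, triple-quoted strings stand in for multi-line comments, and brackets or ‘#’ characters inside ordinary strings are not handled.

```python
def indent_of(text):
    """Number of leading spaces/tabs of a line."""
    return len(text) - len(text.lstrip(" \t"))


def find_block_end(lines, start):
    """Return the index of the first line after ``lines[start]`` that
    really ends the block, or len(lines) if none does."""
    base = indent_of(lines[start])
    depth = 0          # brackets opened but not yet closed
    in_triple = False  # currently inside a triple-quoted string
    for index in range(start + 1, len(lines)):
        line = lines[index]
        stripped = line.strip()
        unindents = bool(stripped) and indent_of(line) <= base
        is_comment = stripped.startswith("#")
        # Report only an unindent that is not explained by a special case.
        if unindents and not is_comment and not in_triple and depth == 0:
            return index
        # Update the multi-line string state for the following lines.
        quotes = line.count('"""') + line.count("'''")
        if in_triple or quotes:
            if quotes % 2 == 1:
                in_triple = not in_triple
            continue
        # Update the bracket depth, ignoring trailing comments.
        code = line.split("#", 1)[0]
        depth += sum(code.count(c) for c in "([{")
        depth -= sum(code.count(c) for c in ")]}")
        depth = max(depth, 0)
    return len(lines)
```

On the two examples above, with one extra top-level line appended after each, this reports that extra line as the end of the block instead of stopping at ‘3, 4, 5]’ or at the comment.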

So the final algorithm is my working solution, as long as we don’t find some problem in that as well. You can follow all the code related to this algorithm in my PR.

Next steps:

Next steps are absolute indentation, hanging indents, keyword indents and an all new bear in the form of the LineLengthBear.

All of this looks really exciting as I see my once-planned project come to life. I really hope all of this is useful someday and people actually use my code to solve their indentation problems :).

Exploring the world of indents

My Project so far:

The coding phase has begun and unfortunately I wasn’t able to start right away, but I had already done some part in the community bonding period and was able to work on another aspect of my project before the week got over. So now, finally, let me introduce you to my project, which is *drumrolls* Generic Spacing Correction. I know it doesn’t sound very exciting, right? But believe me, the “Generic” makes it pretty exciting.

Demystifying the name: my project aims to do what you already see in most editor programs, be it natively or through plugins, which is automatically indenting your code. It doesn’t stop there though. It indents your code, but it’ll indent it regardless of the language you use! So you basically indent your C files and Python files with the same algorithm! Bye-bye ctrl + I? Maybe, but there’s a lot of work to do.

See, every language style guide follows some basic rules when it comes to indentation, one of them being: there are levels of indentation and each level “means” being part of the same context.

So I just have to identify these levels. Easy, right? Well, for a particular language, yes, it’s not so difficult, but when it comes to managing the same for all languages out there, the task becomes a little daunting.

Here’s how I plan to tackle these problems:

First of all, my algorithm doesn’t support all languages from the start. It’d support basic languages like C, C++, Java, maybe even Ruby. This is because, unlike languages like Python, these languages have markers to specify when to start and end indentation.

So far I’ve been able to come up with a very basic implementation, though it still lacks indents based on keywords and also absolute indentation, like:

if(condition)
    indents

and

int some_function(param1,
                  param2,
                  param3)

Other than that, it works for basic indentation of blocks specified by those indent markers. My PR has all the code for this basic indentation algorithm and is still under review/WIP.

Later on I’d like to make this algorithm configurable to the extreme.
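
To give an idea of what marker-based indentation means, here is a minimal sketch. It is not the code from the PR: the function indentation_levels and its start/end parameters are assumptions for this illustration, and markers inside strings or comments are not filtered out.

```python
def indentation_levels(lines, start="{", end="}"):
    """Compute one indentation level per line from start/end markers."""
    levels = []
    level = 0
    for line in lines:
        stripped = line.strip()
        # A line that begins with the end marker belongs to the outer level.
        current = level - 1 if stripped.startswith(end) else level
        levels.append(max(current, 0))
        # Markers on this line change the level of the lines that follow.
        level = max(level + stripped.count(start) - stripped.count(end), 0)
    return levels
```

For the lines `int main() {`, `if (x) {`, `y = 1;`, `}`, `return 0;`, `}` this gives the levels 0, 1, 2, 1, 1, 0. Keyword indents like the `if(condition)` case above would need extra handling, since no marker appears there.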

Apart from the basic functionalities like hanging indents and whatnot, it’d be nice to have an algorithm which is configurable to all styles of indentation.

What is Style of indentation?

Apparently there are many ways of correctly indenting your code, and it’s up to the community what type of indentation they want to follow.

For example:

if(something){
    // code
}

if(something)
{
    // code
}

if(something)
  {
    // code
  }

All of these are ways of indenting an if block. None of these is wrong and it’s entirely up to you which style you prefer. A generic and versatile algorithm would support all three via configuration.
Sadly my basic implementation doesn’t support the third kind yet.

All in all it’s just the beginning and I’m really excited to see how the project develops and what kind of algorithms I’m able to deliver. Hopefully I’ll be back next time with a more functional and useful algorithm.

GSoC: My first blog ever!

My proposal for GSoC ’16 has been selected and so my journey towards an awesome summer has begun.

When I started contributing to open source I had only the bookish knowledge learned from school and college and only a little bit of practical experience. I had no idea how real-life software was run and maintained. Then I found syncplay. It was software I used daily, and after a little while I got to know that it was open source. By that time I knew what open source meant but I hadn’t dived deep into it yet, so I began making some changes to its code-base (all hail open-source!).

A week or two later I had learned a lot: decorators, socket programming, the Twisted framework, UTF-8 encoding and also some of the nitty-gritty of coding. I had known about the GSoC program (one of my friends had participated earlier), and got to know from the contributors that syncplay wouldn’t be participating. So I started searching for organisations that would be participating and stumbled upon coala, which is a static code analyzer (though saying just this is doing it an injustice).

The coala community in one word is awesome! In my little experience I have never seen a community that is so helpful to newcomers! It took a little time to learn how to operate the software at first, but once I got used to it, it was and still is an amazing experience. Working beside sils, AbdealiJK, Makman2 and the rest of the coala community has been an awesome learning and entertaining experience so far, and I can’t wait to see what it will be like once the summer gets started!

As far as my project goes, it deals with creating indentation algorithms that are language independent. It would automatically correct indentation (thanks to awesome diff management by coala), look for cases where lines get too long, and, once it is complete, even break those lines.

Open source has been a really fascinating experience so far. Apart from GSoC I’m looking to contribute to a lot of open source in my spare time. I’d like to finish my PR for syncplay and find some other projects to contribute to. I have a few ideas myself, let’s hope they see the light of day ;).

All in all I’ve learned a lot from these few months of contributing. I’ve developed habits like watching talks and reading blogs, all of them so informative! I’ve learnt a lot about programming: writing not just code, but efficient, well-formatted code. I’ve had glimpses of frameworks and learnt new types of software. Being an IT student was never as exciting as it is now.