Python, indentation and white-space

Updates:

so at the time of the last update i was able to do basic indentation whenever a start and an end indent specifier was provided, this time around i’m working on stuff when the end-indentation specifier is not provided,  for example languages like python

def func(x):
    indent-level1
indent-level2

here we can see that there is no specifier that an unindent is going to occur, so how do i figure out what all lines are a part of one block?

Well the answer is very simple actually, i look for the start indent specifier which in case of python it is the very famous: ‘ : ‘. Now after i find the start of indent specifier, the next step is to find an unindent, in the previous example the line containing ‘indent-level2’ unindents, and voila we have our block, starting from the indent-specifier to the first unindent, easy right? The answer to that is NO, nothing’s that easy.

python doesn’t care about white-space:

well as we all know this isn’t true, python does care about white-space, but not as much as we thought. Python only cares about white-space to figure out indentation, anything else is pretty much useless to it,  for example:

def func(x):
    a = [1, 2,
3, 4, 5]
    if x in a:
        print(x)

this is a pretty valid python code, which prints x if x is an integer between 1 to 5.  What is odd about this examples is, that as we know in python everything has to be indented right? and this breaks that rule! go ahead try this on your own, it works! So no even in python not everything has to be indented, a simpler example could have been:

def func(x):
    a = 1
# This comment is not indented
    print(a)

does it matter if this comment is not indented?  absolutely not! this is a very valid python code as well.

The Problem:

so how is all this related to my algorithm?  as you can see in the second example, the line   ‘# This comment is not indented’ unindents and my algorithm is searching for unindents, hence breaking my algorithm, as it would think that block starts from ‘def func(x):’ and end at ‘# This comment is not indented’, also in the first example it would find that the line    ‘3, 4, 5] ‘ unindents which would again break the algorithm.

The Solution:

The Solution is quite simple in theory: Just be aware of these cases. But that changes the algorithm completely it goes from:

  • check first unindent
  • Report block as line containing specifier to the line which unindents

To

  • check if case of unindent
  • check if this line is a comment
  • check this if line is inside a multiline-comment
  • check if this line is inside paranthesis() or square-brackets []
  • If true repeat from 1
  • else report block.

So the final algorithm is my working solution as long as we are not able to find some problem in that as well. You can follow all the code  related to this algorithm in my PR.

Next steps:

Next steps are absolute indentation, hanging indents, keyword indents and an all new bear in the form of the LineLengthBear.

All of this looks really exciting, as i see my once planned Project come to life, i really hope all of this is useful someday and people actually use my code to solve their indentation problems :).

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s