Top 10 Coding Mistakes In Python & How To Avoid Them


Python is flexible, fun, and easy to learn, but as with any other programming language, there are some common errors that can give you a headache if you don’t understand Python’s quirks. In this article, we will help you save time in the future by introducing you to 10 of the most common errors that you can easily avoid.

NOTE: If you don’t have a recent copy of Python installed to try out the examples in this post, you can download one for free for Windows, Linux or macOS from the ActiveState Platform. A better and more secure way to get started is to create a Python environment with just the packages you need.

Scripts Without Main

Let’s start with a beginner’s mistake that can give you some issues if it isn’t detected promptly. Since Python is a scripting language, you can define functions that you can call in REPL mode. For example:

def super_fun():

   print("Hello, I will give you super powers")

super_fun()

This simple code executes the super_fun function when you run the script from the command line with python no_main_func.py. But what happens when you want to reuse the code as a module in a notebook, for example?

When you import the script, for example with import no_main_func in a notebook cell, super_fun is executed automatically and its message is printed. This is a harmless example, but imagine if your function performs some expensive computational operations, or invokes a process that spawns multiple threads. You wouldn’t want to run these automatically on import, so how can you prevent that from happening?

The standard way is to guard the call with a check of the __name__ variable, which is set to "__main__" only when the file is run as a script:

def super_fun():

   print("Hello, I will give you super powers")


if __name__ == "__main__":

   # execute only if run as a script

   super_fun()

With this code you get the same results as you did before when you invoked the script file from the CLI, but this time you won’t get unexpected behaviors when you import it as a module.
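A common variation on this guard, shown here as a minimal sketch, moves the script logic into a function so it can also be called from other code. The name main is only a convention chosen for this illustration, and super_fun returns its message instead of printing it so that importers can reuse the value.

```python
def super_fun() -> str:
    # Return the message instead of printing it, so importers can reuse it
    return "Hello, I will give you super powers"


def main() -> None:
    # Everything the script does when run directly lives here
    print(super_fun())


if __name__ == "__main__":
    # Runs only when the file is executed as a script, not when it is imported
    main()
```

Importing this file has no side effects; the message is printed only when the file itself is run.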

Float Data Types

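Another common problem for beginners is floating-point handling in Python. Newcomers often assume that floats behave just like integers. A first surprise shows up when you compare object identities with id(): CPython caches small integers, so two names bound to 10 share one object, whereas each float literal entered at the prompt creates a new object. This is an implementation detail, so compare numbers by value with == and never by identity. For example: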

>>> a = 10

>>> b = 10

>>>

>>> id(a) == id(b)

True

>>> c = 10.0

>>> d = 10.0

>>>

>>> id(c) == id(d)

False

>>> print(id(a), id(b))

9788896 9788896

>>> print(id(c), id(d))

140538411157584 140538410559728

Floating point types also have a subtle but important characteristic: their internal representation. The following code uses a simple arithmetic operation that should be simple to solve, but the comparison operator == gives you unexpected results:

>>> a = (0.3 * 3) + 0.1

>>> b = 1.0

>>> a == b

False

The reason for the unexpected result is that most decimal fractions, such as 0.1 and 0.3, have no exact binary floating-point representation, so each operation carries a tiny rounding error and the sum ends up slightly below 1.0. Instead of testing for exact equality, compare floats within a tolerance. The following function does so using the absolute value of the difference between them:

def super_fun(a: float, b: float) -> bool:

   # equal if the absolute difference is below a small tolerance
   return abs(a - b) < 1e-9


if __name__ == "__main__":

   # execute only if run as a script

   print(super_fun((0.3*3 + 0.1),1.0))
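The standard library already provides a tolerance-based comparison in math.isclose, which is usually preferable to a hand-written helper. The sketch below reuses the values from the example above. Note that the default check is relative, so comparisons against zero need an explicit abs_tol; the value 1e-9 is only an illustrative choice.

```python
import math

a = (0.3 * 3) + 0.1

# Exact equality fails because of accumulated rounding error
print(a == 1.0)                                # False

# isclose uses a relative tolerance (rel_tol=1e-09 by default)
print(math.isclose(a, 1.0))                    # True

# A relative tolerance is useless near zero, so pass abs_tol explicitly
print(math.isclose(1e-12, 0.0))                # False
print(math.isclose(1e-12, 0.0, abs_tol=1e-9))  # True
```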

The Boolean Confusion

The definition of what should be considered a Boolean “true” value is a major source of confusion in several programming languages, and Python is no exception. Consider the following comparisons:

>>> 0 == False

True

>>> 0.0 == False

True

>>> [] == False

False

>>> {} == False

False

>>> set() == False

False

>>> bool(None) == False

True

>>> None == False

False

>>> None == True

False

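As you can see, zero of any numeric type compares equal to False, because bool is a subclass of int and False has the value 0. Empty collections like lists, sets, or dictionaries do not compare equal to False, yet they are still “falsy”: bool([]) is False, so if not my_list: treats an empty list as false. In other words, == tests equality while an if statement tests truthiness. Keep in mind that None is equal to neither True nor False, although it is falsy as well. This can be problematic when a variable that was left as None is later used in a comparison, so test for it explicitly with is None.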
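The difference between truthiness and equality can be sketched side by side. The variable names below are chosen only for this illustration.

```python
values = [0, 0.0, "", [], {}, set(), None]

# Truthiness: what an if statement sees
truthiness = [bool(v) for v in values]

# Equality: what == False sees
equal_to_false = [v == False for v in values]

print(truthiness)      # [False, False, False, False, False, False, False]
print(equal_to_false)  # [True, True, False, False, False, False, False]

# Prefer explicit checks that say what you mean
items = []
print(items is None)    # False: the list exists
print(len(items) == 0)  # True: it is just empty
```

Every value in the list is falsy, but only the numeric zeros compare equal to False.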

Unstoppable Scripts

A slightly different type of problem occurs when beginners want to execute infinite loops inside their scripts, while preserving their ability to stop them. For instance, check out the following loop:

import time

while True:

   try:

       print("Run Forrest, run")

       time.sleep(5)

   except:  # a bare except also catches KeyboardInterrupt

       print("Forrest cannot stop")


Run Forrest, run

^CForrest cannot stop

Run Forrest, run

^CForrest cannot stop

Run Forrest, run

Run Forrest, run

As you might have noticed, even the power of ^C is not enough to stop this loop. But why? That’s because a bare except clause catches every exception, including the KeyboardInterrupt raised when you press ^C. If you want to catch only ordinary errors, you must be explicit and catch Exception:

import time

while True:

   try:

       print("Run Forrest, run")

       time.sleep(5)

   except Exception:

       print("Forrest cannot stop")


Run Forrest, run

Run Forrest, run

Run Forrest, run

^CTraceback (most recent call last):

 File "with_main_func.py", line 6, in <module>

   time.sleep(5)

KeyboardInterrupt

KeyboardInterrupt inherits directly from BaseException rather than from Exception, which is why except Exception lets it through. You can still catch it explicitly and manage it easily:

import time

while True:

   try:

       print("Run Forrest, run")

       time.sleep(5)

   except Exception:

       print("Forrest cannot stop")

   except KeyboardInterrupt:

       print("Ok, Forrest will stop now")

       break  # leave the loop and let the script end normally


Run Forrest, run

Run Forrest, run

^COk, Forrest will stop now
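The same idea can be sketched as a reusable function that survives ordinary errors but still stops on ^C. The names run_until_interrupted, work, and max_iterations are hypothetical and exist only for this illustration; max_iterations is there so the sketch can also terminate on its own.

```python
def run_until_interrupted(work, max_iterations=None):
    """Call work() repeatedly; recover from errors, stop cleanly on ^C."""
    completed = 0
    try:
        while max_iterations is None or completed < max_iterations:
            try:
                work()
            except Exception as exc:
                # Ordinary errors are reported and the loop keeps going
                print(f"recoverable error: {exc!r}")
            completed += 1
    except KeyboardInterrupt:
        # Not a subclass of Exception, so it reaches this outer handler
        print("stopping cleanly")
    return completed


# KeyboardInterrupt sits outside the Exception hierarchy
print(issubclass(KeyboardInterrupt, Exception))      # False
print(issubclass(KeyboardInterrupt, BaseException))  # True
```

Placing the KeyboardInterrupt handler outside the loop keeps the exit path in one place, no matter where in the iteration the interrupt arrives.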

Module Name Clash

One question that surfaces from time to time in the discussion boards of popular programming sites concerns mistakes related to Python module names. For example, let’s imagine that you need to solve a complicated math problem, so you create a math.py script with the following code:

from math import abs

def complicated_calculation(a,b):

   return abs(a - b) > 0


if __name__ == "__main__":

  # execute only if run as a script

  print(complicated_calculation(5, 3))

If you run this script, it will give you a stack trace like this:

$ python3 math.py

Traceback (most recent call last):

 File "math.py", line 1, in <module>

   from math import abs

ImportError: cannot import name 'abs' from 'math' (unknown location)

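Two things have gone wrong here. The script shares its name with the standard library’s math module, so an import of math elsewhere in the project can resolve to the wrong file; which one wins depends on the import path and on how the interpreter was built. In addition, abs is a built-in function rather than a member of math (the module offers fabs instead), which is why this particular import fails. Name clashes can get even trickier in non-trivial projects, where the conflicting import may sit deep in the dependency tree.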

To avoid these kinds of naming issues, don’t name your own files after standard library or third-party modules, and make sure you understand how absolute and relative module path imports are resolved.
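A quick check like the following sketch can flag risky file names before they cause trouble. The helper clashes_with_stdlib is hypothetical. It relies on sys.stdlib_module_names, which is available from Python 3.10, and on older interpreters it falls back to the much shorter sys.builtin_module_names, so a negative answer there is not conclusive.

```python
import sys
from pathlib import Path


def clashes_with_stdlib(filename: str) -> bool:
    """Return True if the file name shadows a known standard library module."""
    stem = Path(filename).stem
    # stdlib_module_names exists on Python 3.10+; fall back to built-ins only
    known = getattr(sys, "stdlib_module_names", sys.builtin_module_names)
    return stem in known


for name in ("math.py", "sys.py", "my_math_utils.py"):
    print(name, clashes_with_stdlib(name))
```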

Mutable Function Arguments

Sometimes misused mutable types can result in unexpected behaviors. For example, consider the following snippet:

def list_init(alist=[]):

   alist.append('Initialize with a value')

   return alist


if __name__ == "__main__":

  # execute only if run as a script

  a = list_init()

  print(id(a), a)

  b = list_init()

  print(id(b), b)

  c = list_init()

  print(id(c), c)

You might think that the purpose of the list_init function is to add a new default value, so calling it without arguments should return a list of length 1. However, the results will surprise you:

$ python3 mutable_args.py

140082369213568 ['Initialize with a value']

140082369213568 ['Initialize with a value', 'Initialize with a value']

140082369213568 ['Initialize with a value', 'Initialize with a value', 'Initialize with a value']

What’s going on? Default argument values are evaluated only once, when the function is defined, not each time it is called. Because the list type is mutable, every call that omits the argument appends to that same list object, which is why all three ids are identical.
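The shared default can be observed directly through the function’s __defaults__ attribute, which stores the default values created at definition time. This is a minimal sketch using a shortened version of the function above.

```python
def list_init(alist=[]):
    alist.append("x")
    return alist


# The default list is created once, at definition time
print(list_init.__defaults__)  # ([],)

first = list_init()
second = list_init()

# Both calls returned the very same list object
print(first is second)         # True
print(list_init.__defaults__)  # (['x', 'x'],)
```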

Luckily, a simple change can get you the desired behavior: use None as the default and create a new list inside the function:

def list_init(alist=None):

   if alist is None:

       alist = []


   alist.append('Initialize with a value')

   return alist


if __name__ == "__main__":

  # execute only if run as a script

  a = list_init()

  print(id(a), a)

  b = list_init()

  print(id(b), b)

  c = list_init()

  print(id(c), c)

List Mutation Inside Iteration

Many developers want to mutate a collection while iterating it. For example, imagine filtering a list manually, as follows:

list_nums = list(range(16))

for idx in range(len(list_nums)):

   n = list_nums[idx]

   if n % 3 == 0:

       del list_nums[idx]

print(list_nums)

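The loop’s range is computed once for the original 16 elements, but each deletion shortens the list, so the index eventually runs past the end and produces a stack trace like this: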

$ python3 list_mutation.py

Traceback (most recent call last):

 File "list_mutation.py", line 3, in <module>

   n = list_nums[idx]

IndexError: list index out of range

Fortunately, Python gives you several techniques for accomplishing something like this without writing much code. For instance, you could filter a list this way:

list_nums = list(range(16))

list_nums = [n for n in list_nums if n % 3 !=0]

print(list_nums)

List comprehensions (along with dictionary comprehensions) are one of the nicest ways to work with collections.
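When the list really has to be modified in place, for instance because other code holds a reference to it, one option is to walk the indices from the end so that deletions never shift items that have not been visited yet. The function name below is hypothetical and the sketch is only one of several workable approaches.

```python
def drop_multiples_in_place(nums, k):
    """Delete every multiple of k from nums without creating a new list."""
    # Iterate from the last index down to 0 so deletions are safe
    for idx in range(len(nums) - 1, -1, -1):
        if nums[idx] % k == 0:
            del nums[idx]


list_nums = list(range(16))
alias = list_nums  # a second name for the same list

drop_multiples_in_place(list_nums, 3)

print(list_nums)           # [1, 2, 4, 5, 7, 8, 10, 11, 13, 14]
print(alias is list_nums)  # True: the original object was modified
```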

References, Copies, And Deep Copies

Many programmers struggle with the modification of variables that are supposed to be independent copies of other variables. This happens when you think you’ve created an independent copy of an object, but actually you’ve just bound another name to the same object.

For example, the assignment operator = gives you a second reference to the same object rather than a copy:

Python 3.8.10 (default, Jun  2 2021, 10:49:15)

[GCC 9.4.0] on linux

Type "help", "copyright", "credits" or "license" for more information.

>>> d1 = {'k':[1,2,3]}

>>> d2 = d1

>>> print('d1 and d2 point to the same object',id(d1), id(d2))

d1 and d2 point to the same object 140265839379264 140265839379264

As you can see, dictionaries d1 and d2 have the same id.

Now let’s try getting an independent copy of a variable using the copy() method:

>>> d2['k'].append(99)

>>> d3 = d1.copy()

>>> print('d3 is a different object but its contents point to the contents of d1',id(d3), d1, d2, d3)

d3 is a different object but its contents point to the contents of d1 140674662142848 {'k': [1, 2, 3, 99]} {'k': [1, 2, 3, 99]} {'k': [1, 2, 3, 99]}

In this case, you might think that modifying d2 wouldn’t affect d1, but both show the appended 99 because the two names refer to the same dictionary. The copy d3 is a different dictionary with its own id, yet it is only a shallow copy: its key 'k' still points to the very same list object as in d1. Any later change to that list through d3 will therefore show up in d1 and d2 as well.

To get a completely different instance of a variable with its own contents which you can then modify, you must use a deep copy:

>>> import copy

>>> d4 = copy.deepcopy(d1)

>>> d3['k'].append(199)

>>> print('d4 is a different object AND its contents are also independent',id(d4), d1, d2, d3, d4)

d4 is a different object AND its contents are also independent 139809259201088 {'k': [1, 2, 3, 99, 199]} {'k': [1, 2, 3, 99, 199]} {'k': [1, 2, 3, 99, 199]} {'k': [1, 2, 3, 99]}

Notice that appending 199 through d3 changed d1, d2, and d3, which all share one list, while d4 kept its own independent list without the new value.
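The three cases can be told apart with the is operator, which compares identities rather than values. This is a minimal sketch with variable names chosen for readability.

```python
import copy

d1 = {"k": [1, 2, 3]}

alias = d1                # same dictionary, second name
shallow = d1.copy()       # new dictionary, shared inner list
deep = copy.deepcopy(d1)  # new dictionary, new inner list

print(alias is d1)             # True
print(shallow is d1)           # False
print(shallow["k"] is d1["k"]) # True: the list is shared
print(deep["k"] is d1["k"])    # False: the list was copied too

d1["k"].append(99)
print(shallow["k"])  # [1, 2, 3, 99]
print(deep["k"])     # [1, 2, 3]
```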

Class Variables

Object-oriented programming aims to structure problems in a way that mimics the real world, but it can seem cumbersome to inexperienced programmers. One of the most common problems is understanding the difference between “class” and “instance” variables.

For example, consider the following code:

class Local:

   motto = 'Think globally'

   actions = []


instance1 = Local()

instance2 = Local()

instance2.motto = 'Act locally'

instance2.actions.append('Reuse')

print( instance1.motto, instance2.motto )

print( instance1.actions, instance2.actions )

The result might be unexpected:

$ python3 class_vars.py

Think globally Act locally

['Reuse'] ['Reuse']

You might expect the class attribute motto to change for all instances, since that’s what happens with the list actions. The difference is that instance2.motto = 'Act locally' is an assignment, which creates a new instance attribute on instance2 that shadows the class attribute, while instance2.actions.append('Reuse') mutates the single list object that the class shares with every instance.

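Generally speaking, mutable class variables are a frequent source of bugs. Keep per-instance state in attributes created inside __init__, and reserve class variables for values that are meant to be shared, such as constants.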
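One way to get independent lists, sketched below with the same class, is to create the mutable attribute inside __init__ so that every instance receives its own object.

```python
class Local:
    # Shared, read-only value: fine as a class attribute
    motto = "Think globally"

    def __init__(self):
        # Created per instance, so nothing is shared by accident
        self.actions = []


instance1 = Local()
instance2 = Local()

instance2.motto = "Act locally"    # creates an instance attribute
instance2.actions.append("Reuse")  # touches only instance2's list

print(instance1.motto, instance2.motto)      # Think globally Act locally
print(instance1.actions, instance2.actions)  # [] ['Reuse']
print(Local.motto)                           # Think globally
```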

Function Arguments By Reference And Value

Python has a peculiar way of using arguments in functions and methods. Programmers who are switching to Python from languages like Java or C++ might have some issues understanding the way that the interpreter works with arguments.

For example, consider the following code:

def mutate_args(a:str, b:int, c:float, d:list):

   a += " mutated"

   b += 1

   c += 1.0

   d.append('mutated')

   print('After mutation, inside func')

   print(id(a), id(b), id(c), id(d) )

a = "String"

b = 0

c = 0.0

d = ['String']


print( id(a), id(b), id(c), id(d) )

mutate_args(a,b,c,d)

print( a,b,c,d )

print( id(a), id(b), id(c), id(d) )

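The output shows the ids for each variable. Python passes references to objects, so on entry each parameter refers to the same object as the caller’s variable. The += operations on the immutable str, int, and float values create new objects and rebind only the local names, which is why the ids printed inside the function differ from the originals. The list, by contrast, is mutated in place by append(), so its id stays the same.

In the following output, the list is mutated as expected and keeps its identity inside the function, while the caller’s a, b, and c are unchanged: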

$ python3 args.py

139733890106736 9788576 139733890064464 139733890004160

After mutation, inside func

139733890107376 9788608 139733891541072 139733890004160

String 0 0.0 ['String', 'mutated']

139733890106736 9788576 139733890064464 139733890004160

You should be careful to follow best practices, such as returning several values or using object attributes in order to avoid confusion when it comes to mutating function arguments. To get a deeper understanding of passing arguments by reference, check out this link.
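Returning new values instead of mutating arguments can be sketched as follows. The function names bump and with_extra are hypothetical and serve only to illustrate the practice.

```python
from typing import List, Tuple


def bump(label: str, count: int) -> Tuple[str, int]:
    # Immutable inputs: build new values and hand them back
    return label + " mutated", count + 1


def with_extra(items: List[str], extra: str) -> List[str]:
    # Build a new list so the caller's list is left untouched
    return [*items, extra]


a, b = "String", 0
d = ["String"]

new_a, new_b = bump(a, b)
new_d = with_extra(d, "mutated")

print(a, b, d)              # String 0 ['String']
print(new_a, new_b, new_d)  # String mutated 1 ['String', 'mutated']
```

The caller decides whether to rebind its own names to the returned values, so nothing changes behind its back.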

Conclusion

Python is a high-level language that’s fun to work with once you understand some of its quirks. And it’s worth wrapping your head around those quirks since Python is wonderfully suited to solving many day-to-day programming scenarios quickly and easily.

But to avoid the headache of debugging code that seems okay at first glance, you have to understand Python’s design elements. This article provided an introduction to the top ten errors beginners often make, and gave you tips on how to avoid them.

Another problem with Python that has continued to only get messier over the years is package management. Package management is a hard problem in and of itself, and it’s been extended to encompass environment management. Despite the complexity, it’s essential to get it right.

Unfortunately, with Python, it’s historically been all too easy to get it wrong. This is one of the key reasons that Python features so many different package managers.

During development, sinking time into dependency hell in order to sort out problems with your environment is time wasted. As a result, many developers opt for a sophisticated solution that does as much heavy lifting as possible, leaving them free to focus on coding.

In the modern tech world, the secure way to code should be the easiest way to code. But introducing a dependency to your project inevitably brings along other dependencies, creating a domino effect that can make it challenging to spot security vulnerabilities.

Even when you find them, unless they’re critical vulnerabilities, the time and effort to resolve them means they’re rarely addressed, exposing your development and test environments to cyberattack.

If you want to eliminate dependency hell and create more secure code in dev and test without slowing down your sprint, I’d recommend a dependency manager that addresses the limitations of all the others. Take a look at the ActiveState Platform.


Nicolas Bohorquez is a Developer and Entrepreneur from Bogotá, Colombia. He has been involved with technology in several languages, teams, and projects in a variety of roles in Latin America and the United States. He is currently pursuing a Master in Data Science for Complex Economic Systems in Torino, Italy. Nicolas is a regular contributor at Fixate IO.

