
Don’t be lazy. Don’t use eval().

February 2nd, 2009 Chris

There are many advantages to using interpreted languages. One of the obvious benefits of interpreted (as opposed to compiled) languages in my mind is dynamic typing. In some contexts, I don’t care whether foo is a string, an int, or a unique key associative container – I just care that it exists. Every program (whether executable or script, whether binary, bytecode, or ASCII text) is simply a set of rules governing data manipulation. Connecting the dots, we can easily see how the power and flexibility of an interpreted language manifests itself. More often than we realize, the “rules” which govern data manipulation are themselves pieces of data. This fundamental principle, when massaged a little, yields one of the trickiest functions in modern language design: the tremendously powerful, absolutely flexible eval().  I’ll be looking at it from a Python perspective.

A mentor and former coworker of mine once warned me of the dangers of eval(). I scrutinized his arguments. I considered his position. Ultimately, the wisdom in his logic won me over, and I have, I’ll admit, looked back.  But only to realize that there really is never a valid excuse to use eval().  And that’s just what they are, excuses.


With the average modern desktop processor core approaching 2 or 3 GHz, and the average modern desktop RAM meeting or besting 2GB, we are spoiled.  All of us.  We are lulled into a false sense of security by believing that throwing more (and better) hardware at a problem is a sufficient excuse for writing poor code and shoddy algorithms, and for paying less attention to detail overall.  Don’t get me wrong, Jeff’s argument is carefully thought out and well presented.  But I take it with a grain of salt.  In fact, lots of salt.  Don’t use fast hardware as an excuse to be lazy.  Where does the habit of eschewing proper paradigm and using your hardware as a crutch stop?  Perhaps none of us will be renowned computer scientists, but if we continue to take the easy way out simply because our hardware allows us to do so – and we teach new programmers to do the same – how many future Turings and Knuths are we denying the opportunity to get a grasp on what’s really going on and contribute great things to the industry?

The modern interpreter is a maze of logic, grammar, instruction, and contextual analysis.  A lot more goes on than meets the eye to execute your favorite little script:

print "Hello, World!"

I think that any instructor worth their salt would balk at the prospect of writing that little ditty in this fashion:

print eval( '"Hello, World!"' )

Why?  Because it’s entirely unnecessary.  The eval() function isn’t a free lunch.  You’ve got to load in the environment (the global and local variables), parse the target code string, compile that text into bytecode (in the case of Python and similar languages), and THEN you’ve got to actually go ahead and execute the code.  There is a much more straightforward way of doing it (previously illustrated), so for efficiency’s sake there is no argument for using the second form over the first.
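
To make those steps a little more concrete, here is a rough sketch. It is written in Python 3 syntax, which is an assumption on my part (the snippets in this post use the Python 2 print statement). It separates the compile step from the execute step using the built-in compile(), then uses timeit to compare the plain expression with the eval() form. The names source, code, direct_time and eval_time, and the iteration count, are only for illustration, and no particular timing result is claimed.

```python
import timeit

source = '"Hello, World!"'

# Step 1: parse the text and compile it into a code object.
code = compile(source, "<string>", "eval")

# Step 2: execute the compiled code object.
result = eval(code)

# Calling eval() on a string repeats step 1 on every single call.
direct_time = timeit.timeit('"Hello, World!"', number=10000)
eval_time = timeit.timeit("eval('\"Hello, World!\"')", number=10000)

print(result)
print("direct: %.6fs, eval: %.6fs" % (direct_time, eval_time))
```

The two timings will vary from machine to machine; the point is only that the eval() form has to do the parse and compile work each time before it can produce the same string.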

There is also another concern.  A little thing called security.  As I see it, there are two basic methods of eval() invocation.  The first method (call it “A”) involves eval()ing code which all or in part was contributed by a user (e.g. a human).  We’re talking about something like this:

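import sys
# We need at least the name of the function to call.
if len( sys.argv ) < 2:
  print "syntax: ", sys.argv[0], "  [arg1, arg2...]"
  sys.exit( -1 )
# Whatever the user typed is evaluated as Python code.
func = eval( sys.argv[1] )
apply( func, sys.argv[2:] )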

Here, we are running arbitrary user generated syntax right in our code.  The second method (call it “B”) is when you run code that is wholly or partly contributed by some nonhuman subsystem, such as a randomization routine, a hashing algorithm, or a MySQL query.  For example:

for record in result: # Where 'result' is a valid MySQL result object
  func = eval( record[0] )
  apply( func, record[1:] )

Both methods of using eval() are inherently unsafe.  In particular, both allow arbitrary systems (whether human or not – and more on that later) to load code into Python to be executed along with the rest of the script.  When one system takes code in plain text form, and passes that code to another system for execution, that’s called injection, and it’s exactly what eval() is doing.  A seasoned web developer would never intentionally leave a vehicle by which a user could inject JavaScript into their site, and similarly anyone who uses a database should be well aware of the threat posed by SQL injection.  And yet time and time again developers hardcode injection into their own scripts by using eval().
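
Here is a small sketch of what that injection looks like in practice, again in Python 3 syntax (my assumption, not the syntax of the snippets above). The helper lookup_by_eval is a hypothetical name, and the injected payload is deliberately harmless: it only asks the os module for the current working directory. The text could just as easily have come from a database row as from the command line.

```python
import os

def lookup_by_eval(name):
    # Deliberately unsafe: the caller's text is evaluated as Python code.
    return eval(name)

# Intended use: the text names an existing function.
intended = lookup_by_eval("len")

# Injected use: the text is a whole expression, and it runs during the "lookup".
# This payload is harmless; it only asks for the current working directory.
injected = lookup_by_eval("__import__('os').getcwd()")

print(intended)
print(injected == os.getcwd())
```

Nothing in lookup_by_eval distinguishes the two calls. The second one imported a module and called into it before the caller ever got a "function" back.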

You might argue that method B is a much safer alternative, since there is no direct human interaction.  While there is some merit to that stance, the key is to realize that those other subsystems (the hashing algorithm, the database, the “secure” source files) are themselves vulnerable to attack.  If you’re eval()ing code that came from a database, and someone compromises that database, there goes the security of your script, and the validity of anything relying upon its output.  In effect, method B is only a roundabout way of method A.  And we all know that security through obscurity offers no security at all.

Recently a professor of mine (Dr. Dale Parson, formerly of Bell Labs) set us an assignment (pdf).  We are to take an incomplete script and fill in a few very small holes.  The exercise itself is painfully simple, but his goal is obvious enough.  Dr. Parson wants us to learn the mechanics of apply() and eval() while giving us some overall exposure to the language (Python, that is).  So, strictly from an instructional point of view, I see the exercise as pretty reasonable.  What I find flawed is that it perpetuates the use of eval().

I’d be willing to bet that, upon submission of this project, Dr. Parson (whether manually or via some helper script) will first run the program to see if it even works, and then open the source and check the code.  In that order.  In this assignment, we are prompted to write our own sort function, and then, using eval() and apply(), teach the script how to take the name of the sort function we’d like to use via command line arguments (i.e. sys.argv).  What would happen if I were to write a function called mysort() which conformed to the appropriate signature and performed the requested sorting, but also annihilated the user’s home directory?  What would stop me?  Not eval().  And if he examined the code first?  I could spend the time to obscure the code enough that, unless he was explicitly looking for it, he wouldn’t notice even if he did examine the code before he ran it.

And it would be his fault for using eval().

Sparing the details, his project guidelines suggest writing the following line of code, in order to get a pointer to the sort function the user wants to use:

sortfunc = eval( sys.argv[1] )

How do you check to make sure sys.argv[1] holds the name of a valid sort function?  Much more straightforward (and tremendously more safe) would be to fabricate a whitelist of acceptable sort functions, like so:

whitelist = { 'mergesort': mergesort, 'splitsort': splitsort, 'mysort': mysort }

And then your apply() call changes from:

apply( sortfunc, arguments )

to:

apply( whitelist[sys.argv[1]], arguments )

Granted, in this situation it’s a case of ‘six of one, half a dozen of the other’, since we would be writing the function and adding it to the whitelist ourselves, and the professor would still probably run them blindly.  The point is that this approach is faster, tidier, and more secure than using eval().  Again I stress that the difference is nil in an insignificant school project like this, but hopefully the example above makes it evident how leaky eval() really is.
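
The whitelist also answers the earlier question of how to check that the argument names a valid sort function. A minimal sketch in Python 3 syntax (an assumption; the lines above use the Python 2 apply()) follows. The function bodies are stand-ins that simply call the built-in sorted(), and WHITELIST and pick_sort are hypothetical names, not anything from the assignment.

```python
def mergesort(items):
    # Stand-in body; a real merge sort would go here.
    return sorted(items)

def mysort(items):
    # Stand-in body for the student's own sort.
    return sorted(items)

# Only names listed here can ever be called.
WHITELIST = {"mergesort": mergesort, "mysort": mysort}

def pick_sort(name):
    """Return the whitelisted sort function, or raise ValueError."""
    try:
        return WHITELIST[name]
    except KeyError:
        raise ValueError("unknown sort function: %r (choose from %s)"
                         % (name, ", ".join(sorted(WHITELIST))))

sortfunc = pick_sort("mysort")
print(sortfunc([3, 1, 2]))
```

In a script, the name would come from sys.argv[1] instead of a literal. Anything that is not a key in the dictionary, including an injected expression, is rejected as plain text and never executed.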

In the past five or so years, I have had what I thought was a legitimate excuse to use eval() two (only two!) times.  Each time, I got up, got a glass of water, took a stroll around the office or my apartment, sat back down, and reworked the script without eval().  And both times, the resulting code was cleaner and more concise from a design standpoint.  And, because I went without eval(), the code was quicker and more secure.

I very much hope that in the lecture after our project is due, the professor goes on a nice rant about the evils of eval(), explains that this was merely an exercise to give us a taste of what interpreted languages are capable of (most of my colleagues have only studied C++ and Java up until now), and makes it totally clear that in a real-world application, eval() shouldn’t be touched with a thirty-nine-and-a-half-foot pole.


Update (6/12/2009): The second chapter of this discussion: eval( “round 2” ).

Categories: Development, Paradigm, Python
  1. February 2nd, 2009 at 23:34 | #1

    Covering eval and code injection dangers in one fell swoop. Nice.

    However, I wouldn’t be so quick to write off the risk that your class assignment example poses. How often do we download libraries online that meet some specific need we have? How often do we review and analyze that code? I rarely probe around a downloaded class unless it breaks when I use it! I’d go so far as to say that even a cautious coder could inadvertently execute malicious code in this fashion.

    Also I’m happy to see “mentor” in the present tense.

  2. Chris
    February 3rd, 2009 at 00:12 | #2

    Frank:

    That’s an excellent point. When downloading libraries and classes, I confess that more often than not, the only rule by which I myself validate downloaded code is simply “does it do the job?” and frequently I don’t open the hood, or if I do, it’s after I’ve been using it for a while and start to get curious. Following that reasoning, you could easily make the argument that using eval() and not reviewing downloaded code are both subsets of a larger class of laziness.

  3. February 4th, 2009 at 19:47 | #3

    I don’t normally do blogs. I don’t have much time, being up late preparing class materials most evenings, and I’d prefer discussions of this sort to be opened within the classroom or a course-specific on-line forum where other students in the course can formulate and express their opinions on these matters. I’d even consider casting a shadow over the author’s reputation by requiring his peers to write defenses or refutations of his opinions. For now I’ll respond, once, because a student has taken enough interest to express an opinion. But, opinions are cheap, and time is not, so I won’t be responding directly to a blog again any time soon.

    My comment at the start of my evening grad class in “Theory of Programming Languages” that I am teaching concurrently with the blogger’s “Procedure Oriented Programming Languages” course was, “My primary goal is to expand your way of thinking about computers. I assume that, by the time you reach grad school, you may be in somewhat of a rut in your thinking about computation. If so, I’d like to help you move outside that rut a bit.”

    Every time that someone loads a sequence of ones and zeroes into a computer, that someone is invoking an eval() function. One of the fundamental concepts of programmable computation is that the data and the programs take the same form. Is 11000011001111110 a program or data? It is possibly both. A compiler manipulates one interpretation of a sequence of ones and zeroes, translating them into another sequence that a loader then jumps into. Programs == data is probably the keystone concept of computer science.

    If you lose track of fundamental concepts, your professional life will be dictated by artifacts, and if your professional life is dictated by artifacts, it will eventually be outsourced and probably off shored. In this profession, you need to become more than the sum of the artifacts that you manipulate.

    A responder above states, “How often do we download libraries online that meets some specific need we have. How often do we review and analyze that code?” My response is, “How often do you load and download code and not even know it?” Loading a Java applet into a browser is an invocation of eval(). Dynamically loading a shared object file (.so on Unix or .dylib on OSX)/dynamic link library (.dll on Windows) is an invocation of eval(). Any time a sequence of ones and zeroes is brought into a computer and then used to guide process execution, it is an invocation of eval().

    It is curious that the blogger invokes the name of Turing. “Perhaps none of us will be renowned computer scientists, but if we continue to take the easy way out simply because our hardware allows us to do so – and we teach new programmers to do the same – how many future Turing’s and Knuth’s are we denying the opportunity to get a grasp on what’s really going on and contribute great things to the industry?”

    What is really going on?

    Count the number of evals in Greg Chaitin’s (http://en.wikipedia.org/wiki/Gregory_Chaitin, http://www.umcs.maine.edu/~chaitin/) “self-delimiting universal Turing machine” at http://www.cs.umaine.edu/~chaitin/utm.r . I count 8. What do you think a Turing machine is? It is an abstract machine for executing eval().

    The potential security problem is noteworthy but specious if you are the programmer coding a Python eval() call.
    ————————————————–
    $ python
    Python 2.4.4 (#71, Oct 18 2006, 08:34:43) [MSC v.1310 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> help(eval)
    Help on built-in function eval in module __builtin__:

    eval(...)
    eval(source[, globals[, locals]]) -> value

    Evaluate the source in the context of globals and locals.
    The source may be a string representing a Python expression
    or a code object as returned by compile().
    The globals must be a dictionary and locals can be any mappping,
    defaulting to the current globals and locals.
    If only globals is given, locals defaults to it.

    >>> def kill():
    ...     print "Your computer is dead."
    ...
    >>> def live():
    ...     print "Your computer is safe"
    ...
    >>> killer = eval("kill", {})
    Traceback (most recent call last):
    File "", line 1, in ?
    File "", line 0, in ?
    NameError: name 'kill' is not defined
    >>> killer = eval("kill", {'kill' : live})
    >>> killer()
    Your computer is safe
    >>>
    ————————————————–
    Those final two parameters to eval() constitute a sandbox, similar in concept to a Java applet sandbox, or to separation of multiple user file manipulation properties supported by operating systems. If you are coding an eval() in a context where it is susceptible to intrusion, run it in a sandbox.

    Here is the global exposure of your CSC 310 programming assignment:

    import sys

    Good luck trying to destroy the world in that context. This assignment runs in a sandbox of a Python module that has imported no IO libraries. I care about my students’ UNIX accounts!

    To conclude, it seems a little strange to suggest, however indirectly, that computer science students should not manipulate tools that lie at the heart of the discipline, such as interactive function evaluation. All tools are dangerous. You can kill yourself by iteratively spreading too much butter on your bread with a butter knife. Computer science students shouldn’t be kept away from computational knives because of their dangers. Instead, they should discipline themselves in learning how to apply those knives, if you’ll pardon the pun.

    I’ve just used my allotted blog time for the semester. Feel free to ask a question in class next time you have a doubt.

  4. John C. Randolph
    June 11th, 2009 at 10:53 | #4

    I think I’m going to use Dr. Parson’s quote about killing oneself iteratively with a butter knife on many future occasions.

    -jcr

  5. Justin
    June 11th, 2009 at 11:06 | #5

    3 things..

    1: apply( sortfunc, arguments ) -> sortfunc(*arguments)

    2:

    >>> sys.stdout.__class__("/etc/motd").read()

    see: http://lwn.net/Articles/321872/

    3: http://mail.python.org/pipermail/python-list/2008-January/645448.html is a better example for the usage of eval.

  6. Chris
    June 11th, 2009 at 11:13 | #6

    John-

    It is a great quote, isn’t it?

  7. rollo
    June 11th, 2009 at 11:23 | #7

    lol. chris just got pwned.

    better luck next time, dickweed.

  8. June 11th, 2009 at 11:35 | #8

    Be aware that even though you may not have imported any IO modules, “file” is a builtin that you probably want to explicitly outlaw using the globals/locals params.

  9. Chris
    June 11th, 2009 at 11:37 | #9

    @rollo

    I was proven wrong. It happens.

    The disconnect in my arguments here can be very simply explained: PHP vs. Python. My experience with eval() – coming from a web development background – was with PHP and JavaScript, which do not support sandboxing. The string of code passed in is executed in the current scope. In that type of environment, my arguments remain unchallenged, as Dr. Parson’s rebuttal was formed on the firm ground of sandboxing. As for Python-specific eval() use, I was wrong.

    My other argument against eval() was one of performance, and not throwing hardware at a “hard” programming problem just because you’re able to do so. The second link posted by Justin [above] is good evidence for the performance hit taken with eval().

    True, however, that it was my fault for not checking the documentation for Python’s flavor of eval() before posting about it. ;-)

  10. June 11th, 2009 at 11:38 | #10

    My above comment will not protect you against the evil that Justin described.

  11. DL
    June 11th, 2009 at 11:45 | #11

    With all due respect to Dr. Parson, I think he missed the point.

    His response is metaphorically the same argument as “guns don’t kill people, people kill people”: that a person can use a knife to kill someone instead of a gun.

    What is missed here is sensitivity and risk. Killing someone with a knife is much harder than with a gun. It takes much effort, attention, and struggle and you risk getting hurt yourself. It is easy (and lazy) to do it from a distance with a gun.

    What I get from the blog is that the root problem is laziness, and eval() evokes laziness. Sure, it can be used safely with care, as a loaded gun can be properly secured as well. But leaving a loaded gun lying around, especially where intruders can access it, is a lot more dangerous than leaving a butter knife lying around. The sensitivity, not the possibility, to misuse and insecurity is key here.

    Dr. Parson’s points are all valid, and a different perspective is always good. But I don’t think they addressed the actual key point.

  12. Anon
    June 11th, 2009 at 11:55 | #12

    Python’s eval certainly does not create a sandbox.

    Try this:

    eval("__import__('os').system", {})

    Read this to see how difficult it actually is to create a Python sandbox: http://tav.espians.com/a-challenge-to-break-python-security.html

  13. Another Chris
    June 11th, 2009 at 12:07 | #13

    @Chris, good on you for having the courage to admit your error.

    The mark of a good programmer.

    p.s. Stay away from them butter knives! Who knew?

  14. Heikki Toivonen
    June 11th, 2009 at 12:44 | #14

    Python’s builtins and stdlib do not provide effective sandboxes. Just take another careful look at Justin’s example above. Here is some additional information about sandboxing Python:

    http://wiki.python.org/moin/How%20can%20I%20run%20an%20untrusted%20Python%20script%20safely%20(i.e.%20Sandbox)

  15. Shawn
    June 11th, 2009 at 12:57 | #15

    I once looked into what would happen if you added an eval operator to Brainfuck. Then I realized I invented assembly language.

  16. June 11th, 2009 at 13:09 | #16

    Well, I have a feeling this was posted on Reddit as “professor pwns programming student lol lol,” and it turns out there’s no humor involved in this at all. :\

    It’s just a difference of opinions on the use of eval() — one coming from local sandboxed environments, and one coming from a web-based environment where sandboxing doesn’t exist.

  17. June 11th, 2009 at 13:19 | #17

    I feel Dr. Parson is correct in giving students the opportunity to actually use tools, however dangerous they may be; but I feel the author is also correct in that, outside of the academic world, tools do tend to be abused / misused.

    Funny that this appeared in Reddit as some “lulz epic” whatever, when, really, both are simply recommending exercising responsibility when using tools (i.e. stay away from eval() in production if you are not experienced enough to avoid misuse, and use it appropriately if you are).

  18. Hatem Nassrat
    June 11th, 2009 at 13:34 | #18

    @Shawn lol
    @DL you raise a good point, but I think programming is more of a discipline. If you are disciplined enough you will not be lazy, and write ugly, unmaintainable code, or use eval just because you can. That being said there are many undisciplined programmers out there, or at least ugly code. It seems that in the FOSS world “mainly” clean code survives, as the ugly code is identified and weeded out. I don’t think there should be any rules against the use of ugly strategies and bad techniques, but the programming community should weed out the ones who rely on such techniques or implement and distribute their own holes. Just a couple of weeks ago, someone implemented the code to facilitate the remote code inclusion attack for ruby and was distributing it as a package.

  19. UL
    June 11th, 2009 at 13:42 | #19

    @DL: actually I’m not sure you got the point of the metaphor the good doctor was making. He didn’t say you could kill someone with a knife, so it’s not the knife’s fault (or, not the fault of eval that you did damage with it); what he said was that you could use a butter knife to iteratively spread butter on your bread until there was enough there to kill you. That doesn’t mean you should avoid all butter knives. It means you should avoid gluttony (or in this case, lazy use, overuse, or over-dependence on your tools).

    All in all, an interesting discussion from the author and the commenters. Thanks for it!

  20. Redditor
    June 11th, 2009 at 14:05 | #20

    With all due respect, the professor is terribly wrong. Python’s eval does not provide any sort of sandboxing whatsoever. Here is an example that does destroy the world in the context the professor provided:

    eval(‘__import__(“os”).system(“rm -r /”)’)

  21. James Smith
    June 11th, 2009 at 14:12 | #21

    Nerds!

  22. Robert K
    June 11th, 2009 at 14:13 | #22

    The professor is right. ‘eval’ is a tool just like any other.

    I don’t want my language to hobble my efforts to Get-Stuff-Done(tm)

  23. Magice
    June 11th, 2009 at 14:14 | #23

    *sigh* What an expected expertise from a Python user. Whoa, I have just realized, a PYTHON user, no less, is whining about people being lazy. If only he used, say, Assembly or (at the very least) C/C++, I would have tolerated it. Python programmers dare whine about others being lazy. *sigh*. Yeah, right. Remember that without all of the computation power, Python would not be usable at all.

    Oh, and even the way he described the problem is wrong. First and foremost, Eval cannot make the solution clearer. It is the last resort weapon, when the programmer says, “okay, I have no other ways, sorry,” and puts in the code. Lazy people don’t know eval, and would never use it.

    Of course, the REAL reason against eval is never brought forth. Among the two reasons presented (performance and security), only the second one is partially correct. Performance? What do you have to say about performance, you interpreter-user? Your whole program IS eval-ed by the interpreter, and you are here whining about how you should not use it for performance? On the contrary, the authors of picoLisp embrace eval, and erase all the border between code and data. PicoLisp is one of the fastest web servers, far faster than your Apache+Python or Apache+PHP setup.

    Secondly, about security. The only reason for security to be compromised is that your language is too weak. Or your platform, whatever weaker. All proper languages (even improper ones like Python) allow sandboxes of some sorts. Secure platforms have protection for their resources. Lastly, any proper programmer, when using eval, understands the risk taken, and prepares proper ground for that. Thus, security can be raised to an acceptable level.

    As an exercise for you, imagine this: you have just written a brilliant Javascript program, far surpassing GMail and GDoc. However, the code is 3MB, which is too much for your users. How can you decrease the amount of data transferred over the Internet? (Hint: Eval for the win).

    With that said, true, Eval should be avoided when possible.

  24. Brian
    June 11th, 2009 at 14:26 | #24

    I’d like to counter the boldest claim made by the blogger: “…there really is never a valid excuse to use eval()”

    Absolutely, 100%, wrong! Eval provides us a way to add scriptability to our applications. For example with Python, we could create a sandboxed environment with bindings to the application’s objects. The user can then control the application using python itself, with the application itself defining just how much control the user has. This is *extremely* powerful. In fact, if you are reading this blog, you’ve likely taken advantage of such a feature. Firebug is a great example of the power of eval, giving users the ability to evaluate script within the context of the web page being debugged.

    Never ever ever… say never!

  25. blick black
    June 11th, 2009 at 14:52 | #25

    Really the lesson here is don’t close your mind to anything. Don’t assume you know anything. In taking such a stance on eval you assumed you knew everything about it. Only hurting yourself, since you would not use it, even when there are good cases for it. Use all tools when appropriate and keep an open mind to everything.

  26. Another Redditor
    June 11th, 2009 at 15:00 | #26

    >>> class Wayne(object):
    ...     def make_it_so(self):
    ...         print self, "Maked."
    ...
    >>> def live():
    ...     print "Lets see..."
    ...     (o for o in "".__bases__[-1].__bases__[-1].__subclasses__() if o.__name__ == "Wayne").next()().make_it_so()
    ...
    >>> print eval("kill", {"kill": live})

    >>> print eval("kill()", {"kill": live})
    Lets see...
    Traceback (most recent call last):
    File "", line 1, in
    File "", line 1, in
    File "", line 3, in live
    AttributeError: 'str' object has no attribute '__bases__'
    >>> def live():
    ...     print "Lets see..."
    ...     (o for o in "".__class__.__bases__[-1].__bases__[-1].__subclasses__() if o.__name__ == "Wayne").next()().make_it_so()
    ...
    >>> print eval("kill()", {"kill": live})
    Lets see...
    Maked.
    None

    You can access everything just by digging around through the basic objects given to you. Because everything is an object in Python, nothing is safe. The Redditor before me is not fully correct, as __import__ doesn’t exist with an empty dict given to eval(). But with sys given in the scope, you could do:

    1. Access sys.__class__ and thus getting access to the Module class
    2. Access the class of everything in sys
    3. Access all base classes of these classes and get their own subclasses, including object (just execute object.__subclasses__() in your interpreter, you will see what you can accomplish)

    Depending on what classes you can reach, you can even get access to other modules. Getting concrete instances of classes is not that easy, as the classes normally have no connection to their instances, but as you see, you can dig slowly into the environment. Imagine you get access to the gc module.

    The problem with eval is not that it executes arbitrary code, and not only security issues. You can’t accomplish anything really useful with eval that you could not accomplish otherwise in a much cleaner way. If you use eval for complex things, you will see yourself constructing large strings of expressions, effectively creating some sort of “second layer” in the program. What do you do if an error occurs somewhere? Try debugging that.

  27. PeeJ
    June 11th, 2009 at 15:08 | #27

    I’ll preface this with a disclaimer: I got my degree thirty years ago and haven’t been involved in anything other than embedded systems for the last fifteen, so other than what I remember from my Theory of Programming Languages class, I’m not qualified to talk about the merits of eval().

    But that’s not important because the most glaring, egregious error has nothing to do with the relative merits of eval(). Frankly, I’m a bit surprised that in all this discussion, no one has brought it up. I’m especially disappointed in the perfesser for not calling out the appalling mistake. Even more so in that he calls this thing we do a “profession.” We’re not technically professionals as there is no oath or code that we profess.

    What the hell am I talking about? This:
    “And it would be his fault for using eval().”

    That’s some serious WTF there, folks. He claims that maliciously destroying someone’s system is the victim’s fault. And no one here bats an eye at that outrageous bullshit!

    If I ran across something similar in one of my employees, they would be in serious danger of getting out-sourced to the exit. His or her job would be re-sourced to someone with at least a minimal understanding of ethical responsibility and “professional” standards.

    The perfesser’s advice about not relying on artifacts is spot on, as far as it goes. The meta advice to everyone here is, don’t miss the forest for the trees.

  28. Ethan
    June 11th, 2009 at 15:21 | #28

    What I don’t understand is why eval() supposedly has anything to do with a student being able to insert malicious code into an assignment. This was the main point the OP was trying to make, and is completely irrelevant.

    In any assignment, no matter what the task is, the user could insert malicious code into the program, obfuscated if they so wished, which would be executed when the professor/lecturer/marker tested the program.

    WTF does this have to do with eval?

  29. xman
    June 11th, 2009 at 15:24 | #29

    i didn’t read the professors response, but i think he’s a crackpot.

  30. xman
    June 11th, 2009 at 15:24 | #30

    let me clarify: from the fact that he’s a professor alone!

  31. Neco
    June 11th, 2009 at 15:56 | #31

    Harooooo, haroooo! China here. We want to export some chinese people.

  32. Mr. Mervin
    June 11th, 2009 at 16:14 | #32

    HELLO!!!!!!!!!!!!!!! YOU’RE ON THE NATIONAL NEWS, MUTHAFUKKAZ!!!

  33. Ohno
    June 11th, 2009 at 17:09 | #33

    I had to quit reading after the second sentence:

    “One of the obvious benefits of interpreted (as opposed to compiled) languages in my mind is dynamic typing. In some contexts, I don’t care whether foo is a string, and int, or a unique key associative container – I just care that it exists.”

    There are so many grammar errors, I can’t imagine what the rest was like!

  34. Will
    June 11th, 2009 at 17:15 | #34

    @Redditor I sincerely hope the professor isn’t running unvetted code as root. And as an instructor who apparently cares more for his students’ education than for their feelings, I would hope Dr. Parson inspects the code first, then runs it in a sandbox anyway…

  35. June 11th, 2009 at 18:27 | #35

    Two points.

    First, please think twice about your professor’s following advice:

    > If you are coding an eval() in a context where it is susceptible to intrusion, run it in a sandbox.

    I know some posters above have commented that Python’s eval does not really provide a sandbox. I don’t know the specifics of that, but a sandbox is itself just a security hole waiting to happen. There are at least theoretical sandbox-based attacks that do not rely on getting out of the sandbox at all; they just use the code running in it to observe CPU and cache timings, and use that data to (e.g.) break cryptosystems. If you have untrusted user data going into an eval, validate it; don’t rely on a sandbox.
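
    As a rough illustration of "validate it" in Python, here is a minimal sketch. It assumes the untrusted input is supposed to be a plain list of numbers; the function name parse_number_list is made up for this example. ast.literal_eval accepts only Python literals, so a function call smuggled into the input is rejected instead of being run.

```python
import ast


def parse_number_list(text):
    """Parse untrusted text such as "[1, 2, 3]" without calling eval().

    Hypothetical helper: assumes the only valid input is a list of numbers.
    """
    # literal_eval only builds literals (numbers, strings, lists, ...);
    # anything else, such as a function call, raises ValueError.
    value = ast.literal_eval(text)
    if not isinstance(value, list):
        raise ValueError("expected a list")
    if not all(type(item) in (int, float) for item in value):
        raise ValueError("expected only numbers in the list")
    return value
```

    Checking the shape of the result after parsing matters as much as the parse itself: a string or a dict is also a valid literal, but not what this caller expects.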

    Secondly, I just wanted to point out that at least one language offers an eval that does not incur any runtime performance penalty. In Perl, eval {..} will be compiled at load time along with the rest of the program. It’s really just Perl giving a silly name to a function that would generally be known as “catch”, though. :-)

  36. fartbreath
    June 11th, 2009 at 19:25 | #36

    In SICP you implement eval and apply as well as use them. It is just as easy to write a Lisp interpreter in Python as it is in Lisp. Maybe that’s where this line of instruction is headed. The “I” in SICP is “interpretation”, which is eval().

    Knuth TAOCP volume 1 covers interpretive programs after covering only subroutines and coroutines. eval() is one of the opening concepts in the huge work.

    Point is, it’s fundamental.
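
    To make the eval/apply pairing concrete, here is a minimal sketch in Python. It is not taken from SICP; it assumes expressions are already parsed into nested Python lists (there is no reader), and the names evaluate, apply_procedure, and GLOBAL_ENV are invented for illustration.

```python
import operator

# Hypothetical starting environment: symbol name -> Python callable.
GLOBAL_ENV = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "<": operator.lt,
}


def evaluate(expr, env):
    """Evaluate a Lisp-like expression given as nested Python lists."""
    if isinstance(expr, str):       # a symbol: look it up
        return env[expr]
    if not isinstance(expr, list):  # a number: evaluates to itself
        return expr
    head = expr[0]
    if head == "if":
        _, test, then_branch, else_branch = expr
        chosen = then_branch if evaluate(test, env) else else_branch
        return evaluate(chosen, env)
    if head == "lambda":
        _, params, body = expr
        return ("closure", params, body, env)
    # Ordinary application: evaluate operator and operands, then apply.
    procedure = evaluate(head, env)
    arguments = [evaluate(arg, env) for arg in expr[1:]]
    return apply_procedure(procedure, arguments)


def apply_procedure(procedure, arguments):
    """Apply a built-in callable or a user-defined closure to arguments."""
    if callable(procedure):
        return procedure(*arguments)
    _, params, body, env = procedure
    local_env = dict(env)
    local_env.update(zip(params, arguments))
    return evaluate(body, local_env)
```

    For example, evaluate([["lambda", ["x"], ["*", "x", "x"]], 4], GLOBAL_ENV) builds a closure and applies it to 4. The two functions call each other, which is the mutual recursion the comment refers to.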

  37. Sean McQuillan
    June 11th, 2009 at 19:26 | #37

    Don’t ever program lisp or do AI work.

  38. Robert
    June 11th, 2009 at 19:51 | #38

    I used eval in a macro I wrote in clojure because I wanted to evaluate a passed-in expression in the calling context. There wasn’t really any way around this since I wanted the expression to be any valid clojure code. Trying to get around using eval would have meant either severely limiting what could be passed in (not desired) or re-writing eval (pointless). If the macro was exposed to user input at any point I would have had to screen it, but the macro was intended solely for programmer usage.

    So there is a reason to use eval, at least in languages with reasonably powerful macros.

  39. Daniel
    June 11th, 2009 at 20:12 | #39

    Hey, that’s academia. Kudos to you for admitting your fault. Kudos to the professor for reading a student’s blog and having enough interest to respond.

  40. Another Redditor again
    June 11th, 2009 at 20:34 | #40

    @Daniel- That is academia, but to me it’s different. The professor mentions how important it is to remember not to have “your professional life … dictated by artifacts” and yet precedes this with a dump on blogging, because it gets in the way of how he’s been doing the same thing for years.

    Uh, Professor… wake up. This is how the youngins’ do it these days. Not using the Internet (especially in the realm of programming and computer science) is a perfect way to demonstrate how one can eventually be outsourced. For your sake you should realize this before your class is taught by a professor from India being paid a quarter of your salary. Good luck then.

  41. A different Redditor
    June 11th, 2009 at 20:42 | #41

    Another Redditor again: “blah blah blah… not using the Internet blah blah blah…”

    Clearly you cannot read or parse simple English. Take a look at the quote:

    “…or a course-specific on-line forum where other students in the course can formulate and express their opinions on these matters.”

    Sounds like he accepts using the internet just fine.

  42. Jimmym Dowopp
    June 11th, 2009 at 20:51 | #42

    Too funny dude, that cat picture and caption totally RULES!

    RT
    http://www.online-privacy.vze.com

  43. anon
    June 11th, 2009 at 21:23 | #43

    LOL OWNED.

  44. Zombie
    June 11th, 2009 at 22:47 | #44

    I like turtles.

  45. Jake Snake
    June 11th, 2009 at 23:16 | #45

    I like kittens.

  46. Pieter
    June 11th, 2009 at 23:38 | #46

    @Joey – you mentioned the Perl ‘eval { }’ operator as something of an oddity, and I’ll concur that it is. ‘eval’ in Perl (and presumably in other languages) comes in two flavors: a code ‘eval’, and a string ‘eval’. As you point out, Perl’s code ‘eval’ is a strictly necessary construct in the language for doing anything of consequence — it is the entirety of Perl’s error handling! Without it, there would be no* way to recover from an error condition, and so even trivial IO operations become all but impossible. (Furthermore, because it is “code” (that is, the lexer and parser actually treat it as executable, and not data), it’s subjected to the same analysis as the rest of your code _at compile time_.)

    Perl’s string ‘eval’ is more traditional, however, and carries with it the danger that Chris was pointing out. At the risk of hijacking the post, there are a few common cases that (string) ‘eval’s get used for, and they are frequently better addressed other ways**.

    Firstly, (string) ‘eval’ can be used to interpolate user input into the code. This is often done without input validation, either because the input _cannot_ be validated, or the programmer trusts the user’s input to be limited to the expected range.

    @@@
    // obj has many properties and methods, including 'foo' and 'bar'
    // input() stands in for whatever reads a line from the user (e.g. prompt() in a browser)
    var str = input("Call which method? (foo / bar)");

    // without validation, str could be 'foo', 'bar', or 'userID=35;obj.updateUser'
    var result = eval("obj." + str + "()");
    @@@

    This is exactly the wrong thing to do, and while there’s much I could write about it, I’ll let the existing literature on “SQL Injection Attacks”(http://www.google.com/search?q=sql+injection+attacks) speak for me.

    The following example performs the same function without the (string) ‘eval’:

    @@@
    var allowed = ['foo', 'bar'];
    var str = '';
    // 'in' on an array tests indices, not values, so use indexOf instead.
    while (allowed.indexOf(str) === -1) {
        str = input("Call which method? (foo / bar)");
    }

    // Javascript permits either '.' or '[]' for lookups.
    // Python would let you use getattr(obj, str)()
    // Ruby would let you use obj.send(str)
    var result = obj[str]();
    @@@
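
    For Python readers, the same allow-list dispatch can be sketched with getattr(). The class Obj, its methods, and the helper call_method are hypothetical names used only for this illustration.

```python
class Obj:
    """Hypothetical object with two public methods and one sensitive one."""

    def foo(self):
        return "foo called"

    def bar(self):
        return "bar called"

    def update_user(self):
        return "must never be reachable from user input"


# Only these names may be chosen by the user.
ALLOWED_METHODS = ("foo", "bar")


def call_method(obj, name):
    """Call obj.<name>() only if name is on the allow-list."""
    if name not in ALLOWED_METHODS:
        raise ValueError("unknown method: %r" % (name,))
    # getattr looks the method up by name; no string is ever executed.
    return getattr(obj, name)()
```

    The user-supplied string is only ever used as a lookup key, so input like 'userID=35;obj.updateUser' fails the membership check instead of being run.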

    A second use for (string) ‘eval’ is to bypass more expensive calculations by interpolating their values into the evaluated code. This is usually a reasonably safe operation (as the ‘eval’ed string is generated by a trusted source), but is typically harder to read and maintain in addition to being an unnecessary practice:

    @@@
    function build_prime_checker() {
        var primes = compute_first_trillion_primes(); // expensive
        // The parentheses make eval() parse the string as a function expression.
        return eval(
            "(function(x) {" +
            "  return [" + primes.toString() + "].indexOf(x) !== -1; " +
            "})"
        );
    }
    @@@

    In this case, the cost of computing the first trillion prime numbers obviously outweighs the cost of stringifying and then re-parsing a trillion numbers, so we do actually get a performance gain if we call the resultant function more than once. But Javascript (and Python, and many others***) supports the concept of a closure, which allows us to neatly sidestep the problem:

    @@@
    function build_prime_checker() {
        var primes = compute_first_trillion_primes(); // expensive
        return function(x) {
            // indexOf tests membership by value; 'in' would test array indices.
            return primes.indexOf(x) !== -1;
        };
    }
    @@@

    This works because the variable ‘primes’ was “in scope” when the prime checking function was defined — so it knows where in memory to look for the ‘primes’ array (and holds a reference to it, so it doesn’t get cleaned up by the GC).
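
    The closure version translates to Python almost directly. This is a minimal sketch under stated assumptions: compute_primes is a stand-in for the expensive computation (a small sieve with a configurable limit rather than a trillion primes), and all names are invented for the example.

```python
def compute_primes(limit):
    """Stand-in for the expensive step: primes up to limit (limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if sieve[n]:
            for multiple in range(n * n, limit + 1, n):
                sieve[multiple] = False
    return frozenset(n for n, is_prime in enumerate(sieve) if is_prime)


def build_prime_checker(limit=1000):
    """Return a function that remembers the precomputed primes."""
    primes = compute_primes(limit)  # computed once, when the checker is built

    def is_prime(x):
        # 'primes' is captured from the enclosing scope; no eval() needed.
        return x in primes

    return is_prime
```

    Calling check = build_prime_checker(100) pays the cost once; every later check(97) is just a set lookup on the captured value.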

    There is still an edge case (potentially) wherein the performance improvements cannot be gained by a closure, and these cases may prove a reasonable place to use (string) ‘eval’. These tend to be highly exceptional.

    The third use case for (string) ‘eval’ comes when you aim to provide an interactive programmatic interface. Two such common uses are the REPL (Read Evaluate Print Loop; like invoking Python without argument), and a scripting interface to the application. For the latter, you make a trade-off: you can give the user full control of your application (and hence give up stability and security) by simply ‘eval’ing their scripts, or you can do the extra work and have security (by embedding an interpreter which has limited access to program internals). In the case of a REPL, the explicit goal is to give the user an interactive environment for experimentation — in these cases, (string) ‘eval’ may simply be the best option.
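
    As a rough sketch of the "extra work" option for a scripting interface, the Python below evaluates only arithmetic expressions by walking the parsed syntax tree. It is an illustration under assumptions, not a complete embedded interpreter: safe_eval and the operator tables are hypothetical names, and only numbers, named variables, and + - * / are supported.

```python
import ast
import operator

# Operators the hypothetical scripting interface chooses to expose.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_eval(expression, variables=None):
    """Evaluate an arithmetic expression without calling eval()."""
    variables = variables or {}

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        # Calls, attribute access, subscripts, etc. all end up here.
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval"))
```

    The user's script can reach only what the walker explicitly handles, which is the trade-off described above: less power for the user, but no direct access to program internals.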

    In closing, I must also say that I find myself somewhat dismayed by the Professor’s misrepresentation of the concept. While it is absolutely true that evaluation of bits is a requirement of program execution, he suggests that machine code passes through the equivalent of an ‘eval’ statement — which may be true if you consider the sole purpose of a computer to be “call ‘eval’ and increment a number”. If you consider a computer a machine capable of reading and executing a sequence of simple instructions (including read, write, add, subtract, etc.), then your perspective necessarily shifts. A function call isn’t just an ‘eval’ with side-effects, it’s a jump instruction. Shared / dynamically-linked library calls aren’t more ‘eval’s, they’re jumps into someone else’s code. And while calling a (string) ‘eval’ has every appearance of another abstraction layer on top of your Favored Programming Language, you give up (willingly!) any protection that the hardware, the O/S, or the interpreter you’re running in have given to your program — because you’ve just run that string as if it were your program.

    As an illustration of elementary metaprogramming, (string) ‘eval’ can be enlightening, and far be it from me to dictate what tools should and shouldn’t be learned! Yet given that (string) ‘eval’ has cut off more fingers than it ought, I would hope to see it taught more often with a warning pointing out, “This is sharp!”

    * I’m sure at least one Perl Monger will counter this point.

    ** Admittedly, some of these other ways require more flexible languages; the examples I’ll give will use Javascript.

    *** For languages that don’t support closures, macros can give you a similar way to express the same concept.

  47. Finite
    June 12th, 2009 at 03:06 | #47

    Python’s “apply” function has been deprecated in favor of the extended calling syntax since version 2.3, which was released in 2003. Instead of apply(func, args) we just write func(*args) these days.
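
    A short sketch of the extended calling syntax, using a made-up function scaled_sum. The old apply() form is shown only in a comment, since the built-in no longer exists in Python 3.

```python
def scaled_sum(a, b, scale=1):
    """Hypothetical example function: add two numbers, then scale the result."""
    return (a + b) * scale


positional = (2, 3)
keyword = {"scale": 10}

# Old style (Python 2 only): apply(scaled_sum, positional, keyword)
# Extended calling syntax: unpack the tuple and the dict at the call site.
result = scaled_sum(*positional, **keyword)
print(result)  # prints 50
```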

    Eval is very useful, but it provides none of the safety the professor thinks it does. A false sense of security is much worse than having no security at all.

    For the example given, however, eval is not the right tool for the job. If you expect the argument to eval to be the name of a function, it would make more sense to just look for the function name in the symbol table and not have to worry about evaluating arbitrary expressions. Read up about the builtins called globals, locals, and vars.

    sortfunc=vars()[sys.argv[1]]
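
    A slightly stricter variant of the same idea, as a sketch: instead of looking the name up in vars(), which exposes every name in the namespace, keep an explicit table of the functions a user may pick. The sort functions and the helper pick_sort_function are hypothetical.

```python
def by_length(items):
    """Sort strings from shortest to longest."""
    return sorted(items, key=len)


def alphabetical(items):
    """Sort strings in ordinary lexicographic order."""
    return sorted(items)


# Explicit table of selectable functions (an assumption of this sketch;
# the one-liner above uses vars() instead).
SORT_FUNCTIONS = {
    "by_length": by_length,
    "alphabetical": alphabetical,
}


def pick_sort_function(name):
    """Return the sort function registered under name, or raise ValueError."""
    try:
        return SORT_FUNCTIONS[name]
    except KeyError:
        raise ValueError("unknown sort function: %r" % (name,))
```

    In a script this would be used as sortfunc = pick_sort_function(sys.argv[1]); an unexpected name produces a clear error rather than handing back some unrelated object from the module namespace.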

    This thread makes me really glad that I learned to program on the interwebs instead of in a classroom!

  48. marvin gardens
    June 12th, 2009 at 05:18 | #48

    @Ethan

    Yes, you’re correct. Inserting malicious code to be run by some professor who is assumed to be totally not careful about running random code, would not require use of eval. This is what makes that whole bit silly.

  49. Michael
    June 12th, 2009 at 08:44 | #50

    RE: PeeJ

    If I were fired by you for that reason, I’d be relieved to no longer be employed by you.

    Yes, it was the attacker who committed an ethical violation, but that being said, it is still the victim who has the responsibility to prevent such an attack. That is our job. We must assume that all inputs are adversarial and do everything within our power to ensure security. To do otherwise is a violation of our obligations.

    Saying “it’s the victim’s fault” is a bit severe without qualification, but it’s not too far off. If you want to use eval(), you must accept the consequences and factor them into your design. You can’t directly be held accountable for the unethical actions of those abusing your decision, but having known in advance that someone would most certainly try, you can certainly be held professionally accountable for the decision to use such a mechanism by your employer.
