
Answer by CristiFati for How do I check whether a file exists without exceptions?

Although almost every possible way has been listed in (at least one of) the existing answers (e.g. Python 3.4 specific stuff was added), I'll try to group everything together.

Note: every piece of Python standard library code that I'm going to post belongs to version 3.5.3.

Problem statement:

  1. Check file (arguably: also folder ("special" file)?) existence
  2. Don't use try / except / else / finally blocks

Possible solutions:

  1. [Python 3]: os.path.exists(path) (also check other function family members like os.path.isfile, os.path.isdir, os.path.lexists for slightly different behaviors)

    os.path.exists(path)

    Return True if path refers to an existing path or an open file descriptor. Returns False for broken symbolic links. On some platforms, this function may return False if permission is not granted to execute os.stat() on the requested file, even if the path physically exists.

    All good, but if following the import tree:

    • os.path - posixpath.py (ntpath.py)

      • genericpath.py, line ~#20+

        def exists(path):
            """Test whether a path exists.  Returns False for broken symbolic links"""
            try:
                st = os.stat(path)
            except os.error:
                return False
            return True

    it's just a try / except block around [Python 3]: os.stat(path, *, dir_fd=None, follow_symlinks=True). So, your code is try / except free, but lower in the call stack there's (at least) one such block. This also applies to other funcs (including os.path.isfile).
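To illustrate the "slightly different behaviors" of the os.path family mentioned above, here is a minimal sketch. The helper name describe_path and the file names are made up for the example; only the os.path calls come from the standard library.

```python
import os
import tempfile


def describe_path(path):
    """Return which os.path predicates hold for the given path (hypothetical helper)."""
    return {
        "exists": os.path.exists(path),    # follows symlinks; False for broken ones
        "lexists": os.path.lexists(path),  # True even for broken symlinks
        "isfile": os.path.isfile(path),    # regular file (or symlink to one)
        "isdir": os.path.isdir(path),      # directory (or symlink to one)
    }


with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "sample.txt")  # hypothetical file name
    with open(file_path, "w") as f:
        f.write("data")

    file_info = describe_path(file_path)
    dir_info = describe_path(tmp_dir)
    missing_info = describe_path(os.path.join(tmp_dir, "missing.txt"))
```

None of these calls needs a try / except in your own code, although (as shown above) each of them relies on one internally.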

    1.1. [Python 3]: Path.is_file()

    • It's a fancier (and more pythonic) way of handling paths, but
    • Under the hood, it does exactly the same thing (pathlib.py, line ~#1330):

      def is_file(self):
          """
          Whether this path is a regular file (also True for symlinks pointing
          to regular files).
          """
          try:
              return S_ISREG(self.stat().st_mode)
          except OSError as e:
              if e.errno not in (ENOENT, ENOTDIR):
                  raise
              # Path doesn't exist or is a broken symlink
              # (see https://bitbucket.org/pitrou/pathlib/issue/12/)
              return False

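For completeness, a short usage sketch of the pathlib variant. The file names are placeholders chosen for the example, and the temporary directory is only there so the snippet can run anywhere.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    base = Path(tmp_dir)
    existing = base / "sample.txt"  # hypothetical file name
    existing.write_text("data")

    checks = {
        "file_is_file": existing.is_file(),                  # regular file
        "dir_is_file": base.is_file(),                       # a directory is not a regular file
        "dir_is_dir": base.is_dir(),
        "missing_exists": (base / "missing.txt").exists(),   # nothing was created here
    }
```
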
  2. [Python 3]: With Statement Context Managers. Either:

    • Create one:

      class Swallow:  # Dummy example
          swallowed_exceptions = (FileNotFoundError,)

          def __enter__(self):
              print("Entering...")

          def __exit__(self, exc_type, exc_value, exc_traceback):
              print("Exiting:", exc_type, exc_value, exc_traceback)
              return exc_type in Swallow.swallowed_exceptions  # only swallow FileNotFoundError (not e.g. TypeError - if the user passes a wrong argument like None or float or ...)

      • And its usage - I'll replicate the os.path.isfile behavior (note that this is just for demonstration purposes, do not attempt to write such code for production):

        import os
        import stat

        def isfile_seaman(path):  # Dummy func
            result = False
            with Swallow():
                result = stat.S_ISREG(os.stat(path).st_mode)
            return result

    • Use [Python 3]: contextlib.suppress(*exceptions) - which was specifically designed for selectively suppressing exceptions


    But, they seem to be wrappers over try / except / else / finally blocks, as [Python 3]: The with statement states:

    This allows common try...except...finally usage patterns to be encapsulated for convenient reuse.
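The contextlib.suppress variant of the same idea could look roughly like this. The function name isfile_suppress is made up for the example, and, like the Swallow class, it is a sketch for demonstration rather than something to ship.

```python
import os
import stat
import tempfile
from contextlib import suppress


def isfile_suppress(path):
    """Mimic os.path.isfile, letting contextlib.suppress hide FileNotFoundError."""
    result = False
    # Only FileNotFoundError is suppressed; anything else (e.g. TypeError
    # for a wrong argument type) still propagates to the caller.
    with suppress(FileNotFoundError):
        result = stat.S_ISREG(os.stat(path).st_mode)
    return result


with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "sample.txt")  # hypothetical file name
    with open(file_path, "w") as f:
        f.write("data")

    on_file = isfile_suppress(file_path)
    on_dir = isfile_suppress(tmp_dir)
    on_missing = isfile_suppress(os.path.join(tmp_dir, "missing.txt"))
```

There is no try keyword in sight, but the exception is still raised and caught; it just happens inside the context manager.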

  3. Filesystem traversal functions (and search the results for matching item(s))


    Since these iterate over folders, (in most of the cases) they are inefficient for our problem (there are exceptions, like non wildcarded globbing - as @ShadowRanger pointed out), so I'm not going to insist on them. Not to mention that in some cases, filename processing might be required.
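A rough sketch of what the traversal approach can look like, using os.listdir and a non-wildcard glob. Both helper names are made up for the example, and the listdir variant assumes the parent folder exists (os.listdir raises otherwise).

```python
import glob
import os
import tempfile


def exists_via_listdir(parent_dir, name):
    """Scan the parent folder for an exact name match (assumes parent_dir exists)."""
    return name in os.listdir(parent_dir)


def exists_via_glob(path):
    """Non-wildcard globbing: the result list is empty if nothing matches."""
    # glob.escape neutralizes any special characters so the path is taken literally
    return bool(glob.glob(glob.escape(path)))


with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "sample.txt")  # hypothetical file name
    with open(file_path, "w") as f:
        f.write("data")

    listdir_hit = exists_via_listdir(tmp_dir, "sample.txt")
    listdir_miss = exists_via_listdir(tmp_dir, "missing.txt")
    glob_hit = exists_via_glob(file_path)
    glob_miss = exists_via_glob(os.path.join(tmp_dir, "missing.txt"))
```

The listdir version reads the whole folder just to answer a yes/no question, which is the inefficiency mentioned above. The exact string comparison is also where filename processing (e.g. case handling on case-insensitive filesystems) might be needed.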

  4. [Python 3]: os.access(path, mode, *, dir_fd=None, effective_ids=False, follow_symlinks=True) whose behavior is close to os.path.exists (actually it's wider, mainly because of the 2nd argument)

    • user permissions might restrict the file "visibility" as the doc states:

      ...test if the invoking user has the specified access to path. mode should be F_OK to test the existence of path...

    os.access("/tmp", os.F_OK)
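A slightly longer sketch showing what the 2nd argument adds on top of a plain existence check. The helper name access_report and the file names are assumptions made for the example; os.F_OK, os.R_OK and os.W_OK are the standard mode constants.

```python
import os
import tempfile


def access_report(path):
    """Collect a few os.access answers for one path (hypothetical helper)."""
    return {
        "exists": os.access(path, os.F_OK),    # existence only
        "readable": os.access(path, os.R_OK),  # can the invoking user read it?
        "writable": os.access(path, os.W_OK),  # can the invoking user write it?
    }


with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "sample.txt")  # hypothetical file name
    with open(file_path, "w") as f:
        f.write("data")

    existing_report = access_report(file_path)
    missing_report = access_report(os.path.join(tmp_dir, "missing.txt"))
```

Note that os.access returns False (it doesn't raise) for a missing path, whichever mode is requested.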

    Since I also work in C, I use this method as well because under the hood, it calls native APIs (again, via "${PYTHON_SRC_DIR}/Modules/posixmodule.c"), but it also opens a gate for possible user errors, and it's not as Pythonic as other variants. So, as @AaronHall rightly pointed out, don't use it unless you know what you're doing.

    Note: calling native APIs is also possible via [Python 3]: ctypes - A foreign function library for Python, but in most cases it's more complicated.

    (Win specific): Since vcruntime* (msvcr*) .dll exports a [MS.Docs]: _access, _waccess function family as well, here's an example:

    Python 3.5.3 (v3.5.3:1880cb95a742, Jan 16 2017, 16:02:32) [MSC v.1900 64 bit (AMD64)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os, ctypes
    >>> ctypes.CDLL("msvcrt")._waccess(u"C:\\Windows\\System32\\cmd.exe", os.F_OK)
    0
    >>> ctypes.CDLL("msvcrt")._waccess(u"C:\\Windows\\System32\\cmd.exe.notexist", os.F_OK)
    -1

    Notes:

    • Although it's not a good practice, I'm using os.F_OK in the call, but that's just for clarity (its value is 0)
    • I'm using _waccess so that the same code works on Python3 and Python2 (in spite of unicode related differences between them)
    • Although this targets a very specific area, it was not mentioned in any of the previous answers


    The Linux (Ubuntu 16 x64) counterpart as well:

    Python 3.5.2 (default, Nov 17 2016, 17:05:23)
    [GCC 5.4.0 20160609] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os, ctypes
    >>> ctypes.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp", os.F_OK)
    0
    >>> ctypes.CDLL("/lib/x86_64-linux-gnu/libc.so.6").access(b"/tmp.notexist", os.F_OK)
    -1

    Notes:

    • Instead of hardcoding libc's path ("/lib/x86_64-linux-gnu/libc.so.6"), which may (and most likely, will) vary across systems, None (or the empty string) can be passed to the CDLL constructor (ctypes.CDLL(None).access(b"/tmp", os.F_OK)). According to [man7]: DLOPEN(3):

      If filename is NULL, then the returned handle is for the main program. When given to dlsym(), this handle causes a search for a symbol in the main program, followed by all shared objects loaded at program startup, and then all shared objects loaded by dlopen() with the flag RTLD_GLOBAL.

      • Main (current) program (python) is linked against libc, so its symbols (including access) will be loaded
      • This has to be handled with care, since functions like main, Py_Main and (all the) others are available; calling them could have disastrous effects (on the current program)
      • This doesn't apply to Win (but that's not such a big deal, since msvcrt.dll is located in "%SystemRoot%\System32" which is in %PATH% by default). I wanted to take things further and replicate this behavior on Win (and submit a patch), but as it turns out, [MS.Docs]: GetProcAddress function only "sees" exported symbols, so unless someone declares the functions in the main executable as __declspec(dllexport) (why on Earth would a regular person do that?), the main program is loadable but pretty much unusable
  5. Install some third-party module with filesystem capabilities

    Most likely, it will rely on one of the ways above (maybe with slight customizations).
    One example would be (again, Win specific) [GitHub]: mhammond/pywin32 - Python for Windows (pywin32) Extensions, which is a Python wrapper over WINAPIs.

    But, since this is more like a workaround, I'm stopping here.

  6. Another (lame) workaround (gainarie) is (as I like to call it,) the sysadmin approach: use Python as a wrapper to execute shell commands

    • Win:

      (py35x64_test) e:\Work\Dev\StackOverflow\q000082831>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\System32\\cmd.exe\"> nul 2>&1'))"
      0

      (py35x64_test) e:\Work\Dev\StackOverflow\q000082831>"e:\Work\Dev\VEnvs\py35x64_test\Scripts\python.exe" -c "import os; print(os.system('dir /b \"C:\\Windows\\System32\\cmd.exe.notexist\"> nul 2>&1'))"
      1

    • Nix (Linux (Ubuntu)):

      [cfati@cfati-ubtu16x64-0:~]> python3 -c "import os; print(os.system('ls \"/tmp\"> /dev/null 2>&1'))"
      0
      [cfati@cfati-ubtu16x64-0:~]> python3 -c "import os; print(os.system('ls \"/tmp.notexist\"> /dev/null 2>&1'))"
      512
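A side note on the 512 above: on POSIX systems os.system returns a wait status rather than the command's exit code, and the exit code sits in the high byte (512 == 2 << 8). On Win the returned value is the exit code itself, which is why the outputs there are 0 and 1. A small sketch of decoding it follows; the helper name exit_code_from_system is made up for the example.

```python
import os


def exit_code_from_system(status):
    """Translate an os.system() return value into the command's exit code."""
    if os.name == "nt":
        # On Windows the shell's exit code is returned as-is
        return status
    if os.WIFEXITED(status):
        # Normal termination: the exit code is stored in the high byte
        return os.WEXITSTATUS(status)
    return None  # the command was terminated by a signal


code_for_success = exit_code_from_system(0)
code_for_ls_failure = exit_code_from_system(512)  # value from the Linux session above
```

So "the path exists" would be exit_code_from_system(...) == 0, which only reinforces how roundabout this approach is.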

Bottom line:

  • Do use try / except / else / finally blocks, because they can prevent you from running into a series of nasty problems. The one counter-example that I can think of is performance: such blocks are costly, so try not to place them in code that's supposed to run hundreds of thousands of times per second (but since, in most cases, it involves disk access, that won't be the case).
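As an illustration of that recommendation, here is a minimal sketch of the try / except style. The function name and the default parameter are made up for the example. One of the nasty problems it sidesteps is the race between checking for the file and opening it: the file can disappear in between, which is not possible when the open itself is the check.

```python
import os
import tempfile


def read_text_or_default(path, default=None):
    """Read a file's content, or return default if the file doesn't exist."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        # No separate existence check, so there's no window in which the
        # file could vanish between "check" and "open"
        return default


with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "sample.txt")  # hypothetical file name
    with open(file_path, "w", encoding="utf-8") as f:
        f.write("data")

    content = read_text_or_default(file_path)
    fallback = read_text_or_default(os.path.join(tmp_dir, "missing.txt"), default="")
```
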

Final note(s):

  • I will try to keep this up to date. Any suggestions are welcome, and I will incorporate anything useful that comes up into the answer
