
Descriptor Remote
*****************

Module for remotely retrieving descriptors from directory authorities
and mirrors. This is most easily done through the
"DescriptorDownloader" class, which issues "Query" instances to get
you the descriptor content. For example...

   from stem.descriptor.remote import DescriptorDownloader

   downloader = DescriptorDownloader(
     use_mirrors = True,
     timeout = 10,
   )

   query = downloader.get_server_descriptors()

   print 'Exit Relays:'

   try:
     for desc in query.run():
       if desc.exit_policy.is_exiting_allowed():
         print '  %s (%s)' % (desc.nickname, desc.fingerprint)

     print
     print 'Query took %0.2f seconds' % query.runtime
   except Exception as exc:
     print 'Unable to retrieve the server descriptors: %s' % exc

If you don't care about errors then you can also simply iterate over
the query itself...

   for desc in downloader.get_server_descriptors():
     if desc.exit_policy.is_exiting_allowed():
       print '  %s (%s)' % (desc.nickname, desc.fingerprint)

   get_authorities - Provides tor directory information.

   DirectoryAuthority - Information about a tor directory authority.

   Query - Asynchronous request to download tor descriptors
     |- start - issues the query if it isn't already running
     +- run - blocks until the request is finished and provides the results

   DescriptorDownloader - Configurable class for issuing queries
     |- use_directory_mirrors - use directory mirrors to download future descriptors
     |- get_server_descriptors - provides present server descriptors
     |- get_extrainfo_descriptors - provides present extrainfo descriptors
     |- get_microdescriptors - provides present microdescriptors
     |- get_consensus - provides the present consensus or router status entries
     |- get_key_certificates - provides present authority key certificates
     +- query - request an arbitrary descriptor resource

New in version 1.1.0.

stem.descriptor.remote.MAX_FINGERPRINTS

   Maximum number of descriptors that can be requested at a time by
   their fingerprints.

stem.descriptor.remote.MAX_MICRODESCRIPTOR_HASHES

   Maximum number of microdescriptors that can be requested at a time
   by their hashes.

stem.descriptor.remote.HAS_V3IDENT(auth)

class stem.descriptor.remote.Query(resource, descriptor_type=None, endpoints=None, retries=2, fall_back_to_authority=False, timeout=None, start=True, block=False, validate=False, document_handler='ENTRIES', **kwargs)

   Bases: "object"

   Asynchronous request for descriptor content from a directory
   authority or mirror. These can either be made through the
   "DescriptorDownloader" or directly for more advanced usage.

   To block on the response and get results either call "run()" or
   iterate over the Query. The "run()" method passes along any errors
   that arise...

      from stem.descriptor.remote import Query

      query = Query(
        '/tor/server/all.z',
        block = True,
        timeout = 30,
      )

      print 'Current relays:'

      if not query.error:
        for desc in query:
          print desc.fingerprint
      else:
        print 'Unable to retrieve the server descriptors: %s' % query.error

   ... while iterating fails silently...

      print 'Current relays:'

      for desc in Query('/tor/server/all.z', 'server-descriptor 1.0'):
        print desc.fingerprint

   In either case exceptions are available via our 'error' attribute.

   Tor provides quite a few different descriptor resources via its
   directory protocol (see section 4.2 and later of the dir-spec).
   Commonly useful ones include...

   +---------------------------------------+---------------------------------------------------+
   | Resource                              | Description                                       |
   +=======================================+===================================================+
   | /tor/server/all.z                     | all present server descriptors                    |
   +---------------------------------------+---------------------------------------------------+
   | /tor/server/fp/<fp1>+<fp2>+<fp3>.z    | server descriptors with the given fingerprints    |
   +---------------------------------------+---------------------------------------------------+
   | /tor/extra/all.z                      | all present extrainfo descriptors                 |
   +---------------------------------------+---------------------------------------------------+
   | /tor/extra/fp/<fp1>+<fp2>+<fp3>.z     | extrainfo descriptors with the given fingerprints |
   +---------------------------------------+---------------------------------------------------+
   | /tor/micro/d/<hash1>-<hash2>.z        | microdescriptors with the given hashes            |
   +---------------------------------------+---------------------------------------------------+
   | /tor/status-vote/current/consensus.z  | present consensus                                 |
   +---------------------------------------+---------------------------------------------------+
   | /tor/keys/all.z                       | key certificates for the authorities              |
   +---------------------------------------+---------------------------------------------------+
   | /tor/keys/fp/<v3ident1>+<v3ident2>.z  | key certificates for specific authorities         |
   +---------------------------------------+---------------------------------------------------+

   The '.z' suffix can be excluded to get a plaintext rather than
   compressed response. Compression is handled transparently, so this
   shouldn't matter to the caller.
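
   The resource paths in the table above follow a simple pattern, so
   they can be assembled from a list of identifiers. The snippet below
   is a minimal sketch of that pattern. "build_resource" is a
   hypothetical helper written for this example rather than part of
   this module, and the fingerprints and hashes are placeholders...

```python
def build_resource(prefix, identifiers, separator='+', compressed=True):
    """Join identifiers into a directory resource path (hypothetical helper)."""
    if isinstance(identifiers, str):
        identifiers = [identifiers]  # accept a single identifier as well as a list

    if not identifiers:
        raise ValueError('at least one identifier is required')

    resource = '%s/%s' % (prefix.rstrip('/'), separator.join(identifiers))

    # The '.z' suffix asks for a compressed response, as described above.
    return resource + '.z' if compressed else resource


fingerprints = ['A' * 40, 'B' * 40]  # placeholder fingerprints, not real relays

# Fingerprints are joined with '+', microdescriptor hashes with '-'.
server_resource = build_resource('/tor/server/fp', fingerprints)
micro_resource = build_resource('/tor/micro/d', ['hash1', 'hash2'], separator='-')
```

   The resulting strings have the same shape as the table's
   '/tor/server/fp/<fp1>+<fp2>+<fp3>.z' and
   '/tor/micro/d/<hash1>-<hash2>.z' entries.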

   Variables:
      * **resource** (*str*) -- resource being fetched, such as
        '/tor/server/all.z'

      * **descriptor_type** (*str*) -- type of descriptors being
        fetched (for options see "parse_file()"), this is guessed from
        the resource if **None**

      * **endpoints** (*list*) -- (address, dirport) tuples of the
        authority or mirror we're querying, this uses authorities if
        undefined

      * **retries** (*int*) -- number of times to attempt the request
        if downloading it fails

      * **fall_back_to_authority** (*bool*) -- when retrying, issues
        the last request to a directory authority if **True**

      * **content** (*str*) -- downloaded descriptor content

      * **error** (*Exception*) -- exception if a problem occurred

      * **is_done** (*bool*) -- flag that indicates if our request has
        finished

      * **download_url** (*str*) -- last url used to download the
        descriptor, this is unset until we've actually made a download
        attempt

      * **start_time** (*float*) -- unix timestamp when we first
        started running

      * **timeout** (*float*) -- duration before we'll time out our
        request

      * **runtime** (*float*) -- time our query took, this is **None**
        if it's not yet finished

      * **validate** (*bool*) -- checks the validity of the
        descriptor's content if **True**, skips these checks otherwise

      * **document_handler**
        (*stem.descriptor.__init__.DocumentHandler*) -- method in
        which to parse a "NetworkStatusDocument"

      * **kwargs** (*dict*) -- additional arguments for the descriptor
        constructor

   Parameters:
      * **start** (*bool*) -- start making the request when
        constructed (default is **True**)

      * **block** (*bool*) -- only return after the request has been
        completed, this is the same as running **query.run(True)**
        (default is **False**)
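
   The **retries** and **fall_back_to_authority** variables above can
   be read as a retry loop in which only the final attempt is forced
   onto a directory authority. The following is a sketch of that
   reading, not this module's implementation. "pick_endpoint",
   "download_with_retries" and "flaky_fetch" are hypothetical names,
   the addresses are documentation placeholders, and it assumes one
   initial attempt followed by up to **retries** further attempts...

```python
import random


def pick_endpoint(attempts_left, mirrors, authorities, fall_back_to_authority=False):
    """Choose an (address, dirport) tuple for the next download attempt."""
    last_attempt = attempts_left == 0

    # Use an authority when there are no mirrors, or on the final attempt
    # if the caller asked to fall back.
    if not mirrors or (last_attempt and fall_back_to_authority):
        return random.choice(authorities)

    return random.choice(mirrors)


def download_with_retries(fetch, mirrors, authorities, retries=2, fall_back_to_authority=False):
    """Call fetch(endpoint) until it succeeds or the attempts are used up."""
    last_error = None

    for attempts_left in range(retries, -1, -1):
        endpoint = pick_endpoint(attempts_left, mirrors, authorities, fall_back_to_authority)

        try:
            return fetch(endpoint)
        except Exception as exc:
            last_error = exc  # remember the failure and try again

    raise last_error


MIRRORS = [('192.0.2.10', 9030)]      # placeholder mirror
AUTHORITIES = [('198.51.100.1', 80)]  # placeholder authority
tried = []


def flaky_fetch(endpoint):
    """Stand-in download function: mirrors always fail, authorities succeed."""
    tried.append(endpoint)

    if endpoint in MIRRORS:
        raise IOError('mirror unreachable')

    return 'descriptor content'


result = download_with_retries(
    flaky_fetch, MIRRORS, AUTHORITIES, retries=2, fall_back_to_authority=True)
```

   Here the two mirror attempts fail and the third, final attempt goes
   to the authority and succeeds.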

   start()

      Starts downloading the descriptors if we haven't started already.

   run(suppress=False)

      Blocks until our request is complete then provides the
      descriptors. If we haven't yet started our request then this
      does so.

      Parameters:
         **suppress** (*bool*) -- avoids raising exceptions if
         **True**

      Returns:
         list for the requested "Descriptor" instances

      Raises :
         Using the iterator can fail with the following if
         **suppress** is **False**...

            * **ValueError** if the descriptor content is malformed

            * **socket.timeout** if our request timed out

            * **urllib2.URLError** for most request failures

         Note that the urllib2 module may fail with other exception
         types, in which case we'll pass it along.

class stem.descriptor.remote.DescriptorDownloader(use_mirrors=False, **default_args)

   Bases: "object"

   Configurable class that issues "Query" instances on your behalf.

   Parameters:
      * **use_mirrors** (*bool*) -- downloads the present consensus
        and uses the directory mirrors to fetch future requests, this
        fails silently if the consensus cannot be downloaded

      * **default_args** -- default arguments for the "Query"
        constructor

   use_directory_mirrors()

      Downloads the present consensus and configures ourselves to use
      directory mirrors, in addition to authorities.

      Returns:
         "NetworkStatusDocumentV3" from which we got the directory
         mirrors

      Raises :
         **Exception** if unable to determine the directory mirrors

   get_server_descriptors(fingerprints=None, **query_args)

      Provides the server descriptors with the given fingerprints. If
      no fingerprints are provided then this returns all descriptors
      in the present consensus.

      Parameters:
         * **fingerprints** (*str,list*) -- fingerprint or list of
           fingerprints to be retrieved, gets all descriptors if
           **None**

         * **query_args** -- additional arguments for the "Query"
           constructor

      Returns:
         "Query" for the server descriptors

      Raises :
         **ValueError** if we request more than 96 descriptors by
         their fingerprints (this is due to a limit on the url length
         by squid proxies).
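
      Because of this limit, a longer list of fingerprints has to be
      split into several requests. A minimal sketch of the batching
      follows. "batch" is a hypothetical helper, the 96 is the limit
      quoted above, and the fingerprints are generated placeholders...

```python
MAX_FINGERPRINTS = 96  # per-request limit quoted in this documentation


def batch(items, size=MAX_FINGERPRINTS):
    """Yield consecutive slices of items, each holding at most size entries."""
    items = list(items)

    for start in range(0, len(items), size):
        yield items[start:start + size]


# 200 placeholder fingerprints (40 hex characters each), not real relays.
fingerprints = ['%040X' % i for i in range(200)]

batches = list(batch(fingerprints))
batch_sizes = [len(group) for group in batches]
```

      Each batch is then small enough to be passed as the
      **fingerprints** argument of a separate call.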

   get_extrainfo_descriptors(fingerprints=None, **query_args)

      Provides the extrainfo descriptors with the given fingerprints.
      If no fingerprints are provided then this returns all
      descriptors in the present consensus.

      Parameters:
         * **fingerprints** (*str,list*) -- fingerprint or list of
           fingerprints to be retrieved, gets all descriptors if
           **None**

         * **query_args** -- additional arguments for the "Query"
           constructor

      Returns:
         "Query" for the extrainfo descriptors

      Raises :
         **ValueError** if we request more than 96 descriptors by
         their fingerprints (this is due to a limit on the url length
         by squid proxies).

   get_microdescriptors(hashes, **query_args)

      Provides the microdescriptors with the given hashes. To get
      these see the 'microdescriptor_hashes' attribute of
      "RouterStatusEntryV3". Note that these are only provided via a
      microdescriptor consensus (such as 'cached-microdesc-consensus'
      in your data directory).

      Parameters:
         * **hashes** (*str,list*) -- microdescriptor hash or list of
           hashes to be retrieved

         * **query_args** -- additional arguments for the "Query"
           constructor

      Returns:
         "Query" for the microdescriptors

      Raises :
         **ValueError** if we request more than 92 microdescriptors by
         their hashes (this is due to a limit on the url length by
         squid proxies).

   get_consensus(authority_v3ident=None, **query_args)

      Provides the present router status entries.

      Parameters:
         * **authority_v3ident** (*str*) -- fingerprint of the
           authority key for which to get the consensus, see 'v3ident'
           in tor's config.c for the values.

         * **query_args** -- additional arguments for the "Query"
           constructor

      Returns:
         "Query" for the router status entries

   get_vote(authority, **query_args)

      Provides the present vote for a given directory authority.

      Parameters:
         * **authority** (*stem.descriptor.remote.DirectoryAuthority*)
           -- authority for which to retrieve a vote

         * **query_args** -- additional arguments for the "Query"
           constructor

      Returns:
         "Query" for the router status entries

   get_key_certificates(authority_v3idents=None, **query_args)

      Provides the key certificates for authorities with the given
      fingerprints. If no fingerprints are provided then this returns
      all present key certificates.

      Parameters:
         * **authority_v3idents** (*str*) -- fingerprint or list of
           fingerprints of the authority keys, see 'v3ident' in tor's
           config.c for the values.

         * **query_args** -- additional arguments for the "Query"
           constructor

      Returns:
         "Query" for the key certificates

      Raises :
         **ValueError** if we request more than 96 key certificates by
         their identity fingerprints (this is due to a limit on the
         url length by squid proxies).

   query(resource, **query_args)

      Issues a request for the given resource.

      Parameters:
         * **resource** (*str*) -- resource being fetched, such as
           '/tor/server/all.z'

         * **query_args** -- additional arguments for the "Query"
           constructor

      Returns:
         "Query" for the descriptors

      Raises :
         **ValueError** if resource is clearly invalid or the
         descriptor type can't be determined when 'descriptor_type' is
         **None**

class stem.descriptor.remote.DirectoryAuthority(nickname=None, address=None, or_port=None, dir_port=None, is_bandwidth_authority=False, fingerprint=None, v3ident=None)

   Bases: "object"

   Tor directory authority, a special type of relay hardcoded into tor
   that enumerates the other relays within the network.

   At a very high level tor works as follows...

   1. A volunteer starts up a new tor relay, during which it sends a
      server descriptor to each of the directory authorities.

   2. Each hour the directory authorities make a vote that says who
      they think the active relays are in the network and some
      attributes about them.

   3. The directory authorities send each other their votes, and
      compile that into the consensus. This document is very similar
      to the votes, the only difference being that the majority of the
      authorities agree upon and sign this document. The individual
      relay entries in the vote or consensus are called router status
      entries.

   4. Tor clients (people using the service) download the consensus
      from one of the authorities or a mirror to determine the active
      relays within the network. They in turn use this to construct
      their circuits and use the network.

   Changed in version 1.3.0: Added the is_bandwidth_authority
   attribute.

   Variables:
      * **nickname** (*str*) -- nickname of the authority

      * **address** (*str*) -- IP address of the authority, currently
        they're all IPv4 but this may not always be the case

      * **or_port** (*int*) -- port on which the relay services relay
        traffic

      * **dir_port** (*int*) -- port on which directory information is
        available

      * **fingerprint** (*str*) -- relay fingerprint

      * **v3ident** (*str*) -- identity key fingerprint used to sign
        votes and consensus

stem.descriptor.remote.get_authorities()

   Provides the Tor directory authority information as of **Tor on
   11/21/14**. The directory information is hardcoded into Tor and
   occasionally changes, so the information this provides might not
   necessarily match your version of tor.

   Returns:
      dict of str nicknames to "DirectoryAuthority" instances
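
   A mapping of this shape can be turned into the (address, dirport)
   tuples that the "Query" **endpoints** variable expects. The sketch
   below uses a namedtuple as a stand-in for "DirectoryAuthority" so
   that it runs on its own. The entries are invented placeholders,
   "to_endpoints" is a hypothetical helper, and skipping authorities
   that lack a v3ident is an assumption made for this example...

```python
from collections import namedtuple

# Stand-in with a subset of the attributes documented for DirectoryAuthority.
Authority = namedtuple('Authority', ['nickname', 'address', 'dir_port', 'v3ident'])

# Placeholder entries: the nicknames, addresses and identities are made up.
authorities = {
    'example1': Authority('example1', '192.0.2.1', 9030, 'A' * 40),
    'example2': Authority('example2', '192.0.2.2', 80, None),
}


def to_endpoints(authorities, require_v3ident=True):
    """Turn a nickname -> authority mapping into sorted (address, dir_port) tuples."""
    return sorted(
        (auth.address, auth.dir_port)
        for auth in authorities.values()
        if auth.v3ident or not require_v3ident
    )


endpoints = to_endpoints(authorities)
all_endpoints = to_endpoints(authorities, require_v3ident=False)
```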
