Building the Twitter voting application
Now that we have our environment set up and have seen how to create an app on Twitter and perform three-legged authentication, it is time to get right into building the actual application that will count the Twitter votes.
We start off by creating a model class that will represent a hashtag. Create a file called hashtag.py in the twittervotes/core/twitter directory with the following content:
class Hashtag:
    def __init__(self, name):
        self.name = name
        self.total = 0
        self.refresh_url = None
This is a very simple class. It takes a name as an argument to the initializer; the name is the hashtag without the hash sign (#). In the initializer, we define a few properties: name, which is set to the argument we pass in, and a property called total that will keep the hashtag's usage count for us.
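As a quick sanity check, this is how the class behaves right after construction (the class is reproduced here so the snippet runs on its own):

```python
# Reproduced from hashtag.py so this snippet is self-contained
class Hashtag:
    def __init__(self, name):
        self.name = name
        self.total = 0
        self.refresh_url = None

tag = Hashtag('python')  # note: the name carries no '#' sign
print(tag.name, tag.total, tag.refresh_url)  # python 0 None
```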
Finally, we set the refresh_url. The refresh_url is going to be used to perform queries to the Twitter API, and the interesting part here is that the refresh_url already contains the id of the latest tweet that has been returned, so we can use that to fetch only tweets that we haven't already fetched, to avoid counting the same tweet multiple times.
The refresh_url looks like the following:
'?since_id=963341767532834817&q=%23python&result_type=mixed&include_entities=1'
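To see what that string carries, we can parse it with parse_qsl from the standard library, which is the same approach our search code will use later. Note that parse_qsl also URL-decodes the values, so %23python comes back as #python:

```python
from urllib.parse import parse_qsl

refresh_url = ('?since_id=963341767532834817'
               '&q=%23python&result_type=mixed&include_entities=1')

# Skip the leading '?' and turn the query string into a dict
params = dict(parse_qsl(refresh_url[1:]))
print(params['since_id'])  # 963341767532834817
print(params['q'])         # #python
```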
Now we can open the file __init__.py in the twittervotes/core/twitter directory and import the class that we just created, as follows:
from .hashtag import Hashtag
Perfect! Now go ahead and create a file called request.py in the twittervotes/core/ directory.
As usual, we start adding some imports:
import oauth2 as oauth
import time
from urllib.parse import parse_qsl
import json
import requests
from .config import read_config
from .config import read_reqauth
First, we import the oauth2 package that we are going to use to perform authentication and to prepare the request, signing it with HMAC-SHA1. We also import time to set the OAuth timestamp parameter, and the function parse_qsl, which we are going to use to parse a query string so we can prepare a new request to search for the latest tweets. The json module lets us deserialize the JSON data that the Twitter API sends back to us, and the requests package performs the actual request to the Twitter search endpoint.
Then, we import our own functions, read_config and read_reqauth, so we can read both configuration files:
def prepare_request(url, url_params):
    reqconfig = read_reqauth()
    config = read_config()

    token = oauth.Token(
        key=reqconfig.oauth_token,
        secret=reqconfig.oauth_token_secret)

    consumer = oauth.Consumer(
        key=config.consumer_key,
        secret=config.consumer_secret)

    params = {
        'oauth_version': "1.0",
        'oauth_nonce': oauth.generate_nonce(),
        'oauth_timestamp': str(int(time.time()))
    }

    params['oauth_token'] = token.key
    params['oauth_consumer_key'] = consumer.key
    params.update(url_params)

    req = oauth.Request(method="GET", url=url, parameters=params)
    signature_method = oauth.SignatureMethod_HMAC_SHA1()
    req.sign_request(signature_method, consumer, token)

    return req.to_url()
This function reads both configuration files. The config.yaml configuration file contains all the endpoint URLs that we need, as well as the consumer keys. The .twitterauth file contains the oauth_token and oauth_token_secret that we use to create a Token object, which is passed along with our request.
After that, we define some parameters. According to the Twitter API documentation, oauth_version should always be set to 1.0. We also send oauth_nonce, a unique token that we must generate for every request, and lastly oauth_timestamp, the time at which the request was created. Twitter will reject a request whose timestamp is too old by the time the request is sent.
The last things that we attach to the parameters are oauth_token, which is the token stored in the .twitterauth file, and the consumer key, which is the key stored in the config.yaml file.
Finally, we create the OAuth request, sign it using the HMAC-SHA1 signature method, and return the signed request's URL.
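For the curious, the signing step that the oauth2 library performs can be sketched with the standard library alone. This is a simplified illustration of the OAuth 1.0a HMAC-SHA1 scheme, not the library's exact code (the real implementation handles more percent-encoding edge cases), and the secrets below are placeholders:

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def sign_sketch(method, url, params, consumer_secret, token_secret):
    # 1. Sort the parameters and percent-encode them into one string
    normalized = '&'.join(
        f'{quote(str(k), safe="")}={quote(str(v), safe="")}'
        for k, v in sorted(params.items()))
    # 2. Build the signature base string: METHOD&URL&PARAMS
    base = '&'.join(
        quote(part, safe='') for part in (method, url, normalized))
    # 3. The signing key is consumer_secret&token_secret
    key = f'{quote(consumer_secret, safe="")}&{quote(token_secret, safe="")}'
    digest = hmac.new(key.encode(), base.encode(), hashlib.sha1).digest()
    # 4. The signature is the base64-encoded HMAC-SHA1 digest
    return base64.b64encode(digest).decode()

signature = sign_sketch(
    'GET', 'https://api.twitter.com/1.1/search/tweets.json',
    {'q': '#python', 'oauth_version': '1.0'},
    'consumer-secret', 'token-secret')
print(signature)
```

The signature ends up in the query string of the URL that prepare_request returns, which is why Twitter can verify that the request really came from our app.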
Now we are going to add the function that will perform a request to search for a specific hashtag and return the results to us. Let's go ahead and add another function called execute_request:
def execute_request(hashtag):
    config = read_config()

    if hashtag.refresh_url:
        refresh_url = hashtag.refresh_url[1:]
        url_params = dict(parse_qsl(refresh_url))
    else:
        url_params = {
            'q': f'#{hashtag.name}',
            'result_type': 'mixed'
        }

    url = prepare_request(config.search_endpoint, url_params)

    data = requests.get(url)
    results = json.loads(data.text)

    return (hashtag, results, )
This function gets a Hashtag object as an argument, and the first thing we do is read the configuration file. Then we check whether the Hashtag object has a value in the refresh_url property; if it does, we remove the ? sign at the front of the refresh_url string.
After that, we use the function parse_qsl to parse the query string and return a list of tuples where the first item in the tuple is the name of the parameter and the second is its value. For example, let's say we have a query string that looks like this:
'param1=1&param2=2&param3=3'
If we call parse_qsl, passing this query string as an argument, we will get the following list:
[('param1', '1'), ('param2', '2'), ('param3', '3')]
And then if we pass this result to the dict function, we will get a dictionary like this:
{'param1': '1', 'param2': '2', 'param3': '3'}
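Putting those steps together, the example above runs as:

```python
from urllib.parse import parse_qsl

query = 'param1=1&param2=2&param3=3'

# parse_qsl yields (name, value) tuples
pairs = parse_qsl(query)
print(pairs)    # [('param1', '1'), ('param2', '2'), ('param3', '3')]

# dict turns the tuples into parameter/value mappings
as_dict = dict(pairs)
print(as_dict)  # {'param1': '1', 'param2': '2', 'param3': '3'}
```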
As I showed before, the refresh_url has the following format:
'?since_id=963341767532834817&q=%23python&result_type=mixed&include_entities=1'
And after parsing and transforming it into a dictionary, we can use it to get refreshed data for the underlying hashtag.
If the Hashtag object does not have the refresh_url property set, then we simply define a dictionary where q is the hashtag name and result_type is set to mixed, to tell the Twitter API that it should return popular, recent, and real-time tweets.
After defining the search parameters, we use the prepare_request function that we created above to authorize and sign the request; then we perform the request using the URL that prepare_request returns.
We make use of the json.loads function to parse the JSON data, and return a tuple where the first item is the hashtag itself and the second is the results we got back from the request.
The final touch, as usual, is to import the execute_request function in the __init__.py file in the core module:
from .request import execute_request
Let's see how that works in the Python REPL:
The actual output is much bigger than this, but most of it has been omitted; I just wanted to demonstrate how the function works.