HU Pili edited this page Jun 1, 2013 · 14 revisions

Coding Convention

This page records the coding conventions of this project. It is written in Markdown so that code snippets are easy to share. There are two types of conventions:

  • General. These rules apply not only to this project but also to many others that share the same components (Python, Git, JSON, servers, etc.). For example, the "Python General" section lists rules followed by most Python programmers. Anyone already skilled in Python may safely skip them; they are here to help newcomers get started in this project.

  • Specific. These rules do not apply universally to other projects. Naming notation is one example: some projects favour "this_is_a_variable" while others favour "ThisIsAVariable". Neither is right or wrong in general, but we pick one to guarantee consistency within this project.

Rules are usually developed to guard against pitfalls of languages or of humans. It is good practice to place a short rationale or reference links under each rule.

Text Specific

Source Code Encoding

All files are encoded in UTF-8.

Line break

Use LF (Line Feed, ASCII 10) only. Do not put CR (Carriage Return, ASCII 13) before LF.
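This rule and the UTF-8 rule can be checked mechanically. The following is a minimal sketch; the helper names check_text_bytes and check_text_file are hypothetical and not part of snsapi.

```python
def check_text_bytes(data):
    """Return a list of problems found in the raw bytes of a source file."""
    problems = []
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("not valid UTF-8")
    if b"\r" in data:
        problems.append("contains CR (ascii 13)")
    return problems


def check_text_file(path):
    """Read the file in binary mode so that line breaks are not translated."""
    with open(path, "rb") as f:
        return check_text_bytes(f.read())
```

An empty list means the file passes both checks.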

Python Indentation

Use four spaces ('    ').

Json Indentation

Use two spaces ('  ').

This rule is derived from the existing '.json' files in this project. Whether it is general practice needs further evidence.
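When a '.json' file is produced from Python, the standard json module can emit this indentation directly. A small sketch, with hypothetical keys and values chosen only for illustration:

```python
import json

# Hypothetical configuration entry, for illustration only.
conf = {"channel_name": "example", "platform": "SinaWeiboStatus"}

# indent=2 produces the two-space indentation used in this project.
text = json.dumps(conf, indent=2, sort_keys=True)
print(text)
```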

Comments in Codes

Use "#" for comments in this project. In general, there are three types of comments:

  • real comments, which explain the rationale, frame the code into sections, etc.;
  • option comments, like those in snsconf.py, which do not explain anything but show users the available options;
  • debugging comments, kept only for fast code recovery during development.

After reaching a stable version, the third type (debugging comments) should be removed to keep the code clean. (Git will remember them for us.)

Stylization

The Python community favours PEP8. It is recommended to format your code to conform to it.

The "line too long" warning (E501) is not very reasonable, since development environments are wider nowadays and some lines contain URL templates that make them long. The other suggestions are reasonable to follow.

One can check like this:

find . -name "*.py" | xargs pep8 | grep -v E501

There is also a tool, autopep8, that can apply these fixes automatically.

Python General

Avoid Wildcard Import

Avoid:

from errors import *

Suggested style:

import errors
raise errors.SNSError

# or import the name explicitly
from errors import SNSError

With a wildcard import, you will never know that you did a circular import. See "Wildcard import VS Named import".
There is also an older discussion: "Should wildcard import be avoided".

General Naming Convention

Here's a collection of general naming conventions:

  • Use "__funcname" to "enforce" private members. (See name mangling in the reference.)
  • Use "_funcname" for "protected" members, that is, members expected to be accessed by derived classes but not part of the interface to anything outside the class hierarchy. Python itself cannot enforce "protected" behaviour; the name simply reminds developers.

Reference: http://docs.python.org/tutorial/classes.html
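The effect of the two prefixes can be seen in a short sketch. The class and member names below are hypothetical; only the name-mangling behaviour itself is standard Python.

```python
class Base(object):
    def __init__(self):
        self.__token = "secret"   # stored as _Base__token (name mangling)
        self._cache = {}          # "protected" by convention only

    def __refresh(self):          # stored as _Base__refresh
        return "refreshed"

    def _load(self):
        # Inside the class body the mangled name resolves correctly.
        return self.__refresh()


class Derived(Base):
    def load(self):
        # Derived classes may use "protected" members.
        return self._load()

    def peek(self):
        # Here __token is mangled to _Derived__token, which does not exist.
        return self.__token
```

Calling Derived().peek() raises AttributeError, which is the sense in which "__funcname" is "enforced". The attribute is still reachable as _Base__token, so the protection is against accidents rather than against determined access.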

Logging Convention

The baseline of our convention is drawn from the official reference. There are two general guidelines:

  • Be strict with console output. We will develop a CLI to accompany snsapi. The console output is not only intended for humans; it is also the interface to the rest of the world. Imagine a use case like this:
python snscli.py --auth=all | perl dosomething.pl

Presumably, only the result of authentication should be printed to stdout. That is, always keep a message in the log unless you are certain that the other end of the pipe expects it.

  • Conform to the general conventions of Python's logging usage. Although we have a wrapper in snsapi, we do not want to build conventions from scratch. The wrapper exists to allow a smooth transition to another backend if we need one. Whatever backend we use, many things still apply, e.g. the definition of log levels.

Reference: http://docs.python.org/howto/logging.html#logging-basic-tutorial

Although we try to conform to the conventions of the Python community, our interpretation of the general rules may differ. See our specifics on logging, where we demonstrate the rules with examples.
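The first guideline can be sketched with the standard logging module alone. This is an illustration under assumptions: it does not use the snsapi wrapper, and the logger name and both helper functions are hypothetical.

```python
import logging
import sys

logger = logging.getLogger("snsapi.example")  # hypothetical logger name


def setup_logging(level=logging.INFO):
    """Send log records to stderr so that stdout stays clean for pipes."""
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)


def report_auth(results, out=sys.stdout):
    """Write only the authentication result to the output stream."""
    for channel, ok in sorted(results.items()):
        # Progress information goes to the log, not to stdout.
        logger.info("Channel '%s' checked", channel)
        out.write("%s\t%s\n" % (channel, "ok" if ok else "failed"))
```

With a setup along these lines, the program at the other end of a pipe would receive only the tab-separated result lines, while log messages stay on stderr.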

Python Specific

Naming: Module, Class, Variables, Methods, etc

  • Module names are all lower case, preferably a single word without underscores, e.g. sina.py, snspocket.py.
  • Class names combine multiple words, each with its first character capitalised. Long (but not too long) class names are allowed to keep the meaning clear, e.g. SinaWeiboStatus.
  • Variable and method names are all lower case, with underscores joining multiple words, e.g. read_channel().
  • Method visibility is indicated by the name. In a C++ analogy: public --> funcname(); protected --> _funcname(); private --> __funcname().
  • Constants are all upper case, e.g. RENREN_AUTHORIZATION_URI.
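Put together, the rules above might look like the following sketch. Every name in it (the module file name, the constant, the class and its methods) is hypothetical and chosen only to illustrate the notation.

```python
# Module file name: examplestatus.py (all lower case, single word)

# Constant: all upper case.
EXAMPLE_AUTHORIZATION_URI = "https://example.com/oauth/authorize"


class ExampleStatus(object):               # class: capitalised words
    def __init__(self, channel_name):
        self.channel_name = channel_name   # variable: lower case, underscore

    def read_channel(self):                # public method
        return self._build_request()

    def _build_request(self):              # "protected": for derived classes
        return self.__sign(EXAMPLE_AUTHORIZATION_URI)

    def __sign(self, uri):                 # "private": name-mangled
        return "%s?channel=%s" % (uri, self.channel_name)
```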

Logging examples

Debug

In snsapi.py:

def get_saved_token(self):
    ...
    logger.debug("Saved Access token is expired, try to get one through sns.auth() :D")
    ...
    logger.debug("This channel is configured not to save token to file")
    ...
    logger.debug("No access token saved, try to get one through sns.auth() :D")

What users (higher-layer developers) care about is whether authorization succeeds. In this function, we first determine whether there is a valid access token on local disk. If there is, we move on to the next transaction, such as getting the home_timeline. If not, we use one of several methods to perform OAuth. As the code shows, three different conditions can trigger snsapi to perform OAuth. The details are interesting to plugin developers but not so interesting to users.
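The split between what plugin developers and users see comes from the logger's level. The sketch below uses the standard logging module directly rather than the snsapi wrapper; the handler class, the collect function and the logger names are hypothetical.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted messages in a list instead of printing them."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


def collect(level):
    """Log one debug and one info message at the given threshold."""
    logger = logging.getLogger("snsapi.example.%s" % level)
    logger.setLevel(level)
    logger.propagate = False  # keep the demo isolated from other handlers
    handler = ListHandler()
    logger.addHandler(handler)
    logger.debug("No access token saved, try to get one through sns.auth()")
    logger.info("Channel '%s' is authorized", "demo")
    return handler.messages
```

At logging.INFO only the authorization message is kept; at logging.DEBUG the reason for performing OAuth appears as well.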

Commented Debug

A commented-out debug call is still a debug message. We encourage people to use logger.debug during development. However, as the project grows, different developers' debugging messages will interfere with one another. Sooner or later, the log's signal-to-noise ratio becomes too low to extract useful information. We therefore also encourage developers to comment out debug messages that are not of general interest once their components are fully tested.

e.g. in renren.py:

def renren_request(self, params=None):
    ...
    #logger.debug("request response: %s", s)

This one logs the HTTP response. Sometimes a single message floods your console with two to three pages of output. Obviously, this debug message is not interesting to other developers.

We also encourage people to put a marker after their message, like this:

def renren_request(self, params=None):
    ...
    #logger.debug("request response: %s", s) #log.debug.hupili

This trick is very useful. You can use sed to switch the messages on or off very quickly. In your own branch, switch them on to inspect the finest details. Before pushing to master, switch them off so that they do not bother other developers.
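The same switch can be written in Python instead of sed. This is a minimal sketch, not a project tool; the function name toggle_marked is hypothetical, and it assumes the marker is the last thing on the line.

```python
def toggle_marked(lines, marker, enable):
    """Comment or uncomment lines that end with the given marker comment."""
    out = []
    for line in lines:
        if line.rstrip().endswith("#" + marker):
            body = line.lstrip()
            indent = line[:len(line) - len(body)]
            if enable and body.startswith("#"):
                body = body[1:]        # switch the message on
            elif not enable and not body.startswith("#"):
                body = "#" + body      # switch the message off
            line = indent + body
        out.append(line)
    return out
```

For example, toggle_marked(lines, "log.debug.hupili", True) re-enables only the messages tagged by that developer and leaves all other commented lines untouched.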

Info

Info is used to tell users what the program is doing. Sometimes it is a sign that the program is running as expected. Sometimes it includes digest data for later statistics.

Examples from snsapi.py:

def _oauth2_second(self):
    ...
    logger.info("Channel '%s' is authorized", self.channel_name)
...
def oauth2(self):
    ...
    logger.info("Try to authenticate '%s' using OAuth2", self.channel_name)

This is the first type. By looking at the two messages, you know what the program is doing.

def home_timeline(self, count=20):
    ...
    logger.info("Read %d statuses from '%s'", len(statuslist), self.channel_name)

This one serves both purposes: it shows progress and also records the outcome of the last operation.

Warning

There is only one instance of a warning so far. (We should add more.)

e.g. In renren.py:

logger.warning(response["error_msg"])

A warning is easy to understand: something went wrong, but the program is still alive. In the example above, the request failed, perhaps because a parameter was wrong or requests were sent too fast. In any case, we should let the users know; they can, for example, slow down their requests.

Warning or Warn?

In the official tutorial, warning is distinguished from warn. However, they show no difference in the log file.

e.g. run snslog:

$ python snslog.py
[WARNING][20120828-162008][snslog.py][<module>][127]test: 123; str
[DEBUG][20120828-162008][snslog.py][<module>][128]test debug
[INFO][20120828-162008][snslog.py][<module>][129]test info
[WARNING][20120828-162008][snslog.py][<module>][130]test warning
[WARNING][20120828-162008][snslog.py][<module>][131]test warn
[ERROR][20120828-162008][snslog.py][<module>][132]test error
[CRITICAL][20120828-162008][snslog.py][<module>][133]test critical

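Since the output is identical, there is no need to ponder whether warn or warning should be used. (In the standard library's logging module, warn is a deprecated alias of warning, so warning is the safer default.)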

Error

We do not have an instance of it at present. Error logs should be added to try clauses where we swallow the failure.
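Since the project has no instance yet, the following is only a sketch of what such a try clause might look like. The function read_count, the response layout and the logger name are all hypothetical.

```python
import logging

logger = logging.getLogger("snsapi.example")  # hypothetical logger name


def read_count(response):
    """Return the status count from a response dict, or None on failure."""
    try:
        return int(response["count"])
    except (KeyError, TypeError, ValueError) as e:
        # The exception is swallowed here, so record it before moving on.
        logger.error("Failed to read count from response: %s", e)
        return None
```

The caller keeps running with None, and the error log is the only trace that something went wrong.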

Critical

When a critical error occurs, the program itself cannot continue.

e.g. In snsapi.py:

logger.critical("authClient init error")

I'm not sure whether this is critical or not; I just feel that this error should not happen. We are only creating an OAuth client object and filling in the init data. There is no network transaction. If it fails, the user's configuration is problematic, and we do not want to continue in that case.

Inline Documentation

We use Sphinx to automatically generate documentation from the inline docstrings in the code. They are written in reST (reStructuredText) style.

  • Related issue: #45
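A docstring in this style might look like the sketch below. The function example_timeline and its described behaviour are hypothetical; the field names (:param:, :type:, :returns:, :rtype:) are the standard Sphinx info fields.

```python
def example_timeline(count=20):
    """Read the latest statuses from a channel.

    :param count: Maximum number of statuses to return.
    :type count: int
    :returns: The statuses that were read.
    :rtype: list
    """
    # Stub body; a real method would query the platform here.
    return []
```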

Git General

Commit

  • Commit frequently.
  • One commit <-> one small point.

Branching and Merging

  • Always keep 'master' workable.
  • Small modifications (like textual updates/fixes) are committed to 'master' directly. Others should go through branching and merging.
  • Associate an issue with each branch. This helps others trace what is done in the branch. It also provides a board for discussing approaches.
  • Use "issue???" as the branch name. (recommended by default)
  • Regularly check for updates on 'master' and merge 'master' into your branch. (Compared with merging everything at the end, this helps reduce conflicts.)
  • Once the development of a branch is done, merge it back into 'master'.

Reference: http://git-scm.com/book/en/Git-Branching-Basic-Branching-and-Merging

Special Branches

There are two special branches:

  • master. This branch faces upper-layer developers and end users. It is well tested by default.
  • dev. This branch is for SNSAPI developers, including plugin developers. It is the latest but immature.

Executive summary:

  • Hot bugfixes, documentation updates, and small modifications can go to 'master' directly.
  • For framework upgrades and new plugins, please send a pull request to 'dev'.
  • Once 'dev' becomes stable, merge it into 'master' and tag it with a version number.

In short, 'dev' is like the 'master' for SNSAPI developers.

Git Specific

Tag

  • We use "vx.y.z", where "x", "y" and "z" are integers, to represent a released version. With this notation, the snsapi-website can automatically supply download links for those versions.
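A tag name can be validated against this notation with a short regular expression. The sketch below is an assumption about how such a check could be written, not the code used by the snsapi-website; the names TAG_PATTERN and parse_version_tag are hypothetical.

```python
import re

# "v" followed by three dot-separated integers, and nothing else.
TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)\Z")


def parse_version_tag(tag):
    """Return (x, y, z) as integers, or None if the tag is not a release."""
    m = TAG_PATTERN.match(tag)
    if m is None:
        return None
    return tuple(int(part) for part in m.groups())
```

Parsing into integers also gives a correct ordering, so that a tag such as v0.10.0 sorts after v0.9.2 when the tuples are compared.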