_GLOBAL_DEFAULT_TIMEOUT in socket.py

In Python 2.6, various higher-level modules (like urllib2) use socket._GLOBAL_DEFAULT_TIMEOUT as the default value of their timeout parameter.
If you look at socket.py in Lib, you will find:

_GLOBAL_DEFAULT_TIMEOUT = object()


a) This kind of construct was new to me. Why should _GLOBAL_DEFAULT_TIMEOUT be set to object() rather than, say, None?

Update: Because None is a special timeout value that puts the socket in blocking mode. That is the reason None is not used.

b) If this is just for holding a global value, why not just


global _GLOBAL_DEFAULT_TIMEOUT



Here is my understanding: the reason for having _GLOBAL_DEFAULT_TIMEOUT is to have a global default timeout value that is accessible from the various application-layer modules (like ftplib, httplib, urllib).

Update: Correct

But the default value of the timeout is stored in the lower-layer socket module and is returned by
socket.getdefaulttimeout()
and may be set via socket.setdefaulttimeout() or via the timeout parameters of the application-level opening functions (like urllib2.urlopen('url', timeout=42)).


Update: So what? Wrong.


socket.getdefaulttimeout() is implemented in C, and to preserve the value between the C methods and the Python library methods, an object() would be essential instead of a global.

object() might essentially provide an address location for the variable which would be global.


Am I confused? The way the C implementation works is that the value is stored in a global variable inside the C code and is accessed through interfaces like getdefaulttimeout and setdefaulttimeout.

There is nothing here that maintains global values between Python and modules written in C. Are they sharable? I don't know.

This kind of construct is used to have an object that maintains a global state.

global myobj is just a declaration to the compiler that the name is global: it is defined somewhere else, and the function should use that same binding. It does not set anything or even define myobj. Okay?


a = 100
print id(a)

def foo():
    global a
    print a, id(a)

foo()



[12:17:45 senthil]$python a1.py
23379248
100 23379248


So, in effect:

_GLOBAL_DEFAULT_TIMEOUT = object() just creates an empty object whose identity can be shared and checked against.

That's it. None is not a good option because None in this case means that the socket should be a blocking one.


Here are some experiments with urllib2 and the new socket._GLOBAL_DEFAULT_TIMEOUT



Python 2.7a0 (trunk:72879M, May 24 2009, 12:51:19)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> print socket.getdefaulttimeout()
None
>>> socket.setdefaulttimeout(42)
>>> print socket.getdefaulttimeout()
42.0
>>> import urllib2
>>> obj = urllib2.urlopen("http://www.google.com")
>>> dir(obj)
['__doc__', '__init__', '__iter__', '__module__', '__repr__', 'close', 'code', 'fileno', 'fp', 'getcode', 'geturl', 'headers', 'info', 'msg', 'next', 'read', 'readline', 'readlines', 'url']
>>> dir(obj.fp)
['__class__', '__del__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__iter__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__slots__', '__str__', '__subclasshook__', '_close', '_getclosed', '_rbuf', '_rbufsize', '_sock', '_wbuf', '_wbuf_len', '_wbufsize', 'bufsize', 'close', 'closed', 'default_bufsize', 'fileno', 'flush', 'mode', 'name', 'next', 'read', 'readline', 'readlines', 'softspace', 'write', 'writelines']
>>> dir(obj.fp._sock)
['__doc__', '__init__', '__module__', '_check_close', '_method', '_read_chunked', '_read_status', '_safe_read', 'begin', 'chunk_left', 'chunked', 'close', 'debuglevel', 'fp', 'getheader', 'getheaders', 'isclosed', 'length', 'msg', 'read', 'reason', 'recv', 'status', 'strict', 'version', 'will_close']
>>> dir(obj.fp._sock.fp)
['__class__', '__del__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__iter__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__slots__', '__str__', '__subclasshook__', '_close', '_getclosed', '_rbuf', '_rbufsize', '_sock', '_wbuf', '_wbuf_len', '_wbufsize', 'bufsize', 'close', 'closed', 'default_bufsize', 'fileno', 'flush', 'mode', 'name', 'next', 'read', 'readline', 'readlines', 'softspace', 'write', 'writelines']
>>> dir(obj.fp._sock.fp._sock)
['__class__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'accept', 'bind', 'close', 'connect', 'connect_ex', 'dup', 'family', 'fileno', 'getpeername', 'getsockname', 'getsockopt', 'gettimeout', 'listen', 'makefile', 'proto', 'recv', 'recv_into', 'recvfrom', 'recvfrom_into', 'send', 'sendall', 'sendto', 'setblocking', 'setsockopt', 'settimeout', 'shutdown', 'timeout', 'type']
>>> dir(obj.fp._sock.fp._sock.gettimeout)
['__call__', '__class__', '__cmp__', '__delattr__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__name__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__self__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__']
>>> print(obj.fp._sock.fp._sock.gettimeout)

>>> print(obj.fp._sock.fp._sock.gettimeout())
42.0
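
The same behaviour can be checked without touching the network. The sketch below is for Python 3, and the helper name timeout_of_new_socket is my own invention. It sets the module-wide default, creates a fresh socket, reads its timeout, and then restores the previous default.

```python
import socket


def timeout_of_new_socket(default):
    """Return the timeout a freshly created socket gets under `default`."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(default)
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            return sock.gettimeout()
        finally:
            sock.close()
    finally:
        # Put the old default back so other code is not affected.
        socket.setdefaulttimeout(previous)
```

Calling timeout_of_new_socket(42) should give 42.0, and timeout_of_new_socket(None) should give None, i.e. a blocking socket.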

Redirect business again

.htaccess file has

Redirect 302 /index.html "http://localhost/new file.html"

But still when I do:

>>> obj = urllib2.urlopen("http://localhost/index.html")
>>> print obj.getcode()
200
>>>
Funny, what am I doing and why am I getting it this way? Figuring that out: because the redirect happens transparently, one is not able to capture the redirect code.

If one needs to capture the redirect code, here is how it is done.


import urllib2

class SmartRedirectHandler(urllib2.HTTPRedirectHandler):
    def http_error_302(self, req, fp, code, msg, headers):
        result = urllib2.HTTPRedirectHandler.http_error_302(self, req, fp, code, msg, headers)

        result.status = code
        return result

request = urllib2.Request("http://localhost/index.html")
opener = urllib2.build_opener(SmartRedirectHandler())
obj = opener.open(request)
print 'I capture the http redirect code:', obj.status
print 'Its been redirected to:', obj.url


And the output from the session will be:


[06:59:14 senthil]$python smartredirecthandler.py
I capture the http redirect code: 302
Its been redirected to: http://localhost/new%20file.html
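
For Python 3 the same idea should carry over to urllib.request. The following is only a sketch under some assumptions of mine: the little RedirectingServerHandler test server, the /new.html target and the redirect_status attribute name are all hypothetical and exist just so the example is self-contained. I store the code under redirect_status instead of status because the Python 3 response object already has a status attribute of its own.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectingServerHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /index.html redirects to /new.html."""

    def do_GET(self):
        if self.path == "/index.html":
            self.send_response(302)
            self.send_header("Location", "/new.html")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"new page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


class SmartRedirectHandler(urllib.request.HTTPRedirectHandler):
    def http_error_302(self, req, fp, code, msg, headers):
        result = super().http_error_302(req, fp, code, msg, headers)
        # Remember the redirect code on the final response object.
        result.redirect_status = code
        return result


def fetch_with_redirect_code():
    """Return (redirect code, final code, final URL) for the test server."""
    server = HTTPServer(("127.0.0.1", 0), RedirectingServerHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # An empty ProxyHandler keeps any configured proxy out of the way.
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({}), SmartRedirectHandler())
        url = "http://127.0.0.1:%d/index.html" % port
        response = opener.open(url, timeout=5)
        try:
            return (response.redirect_status, response.getcode(),
                    response.geturl())
        finally:
            response.close()
    finally:
        server.shutdown()
        server.server_close()
```

The final response still reports 200, while the captured attribute keeps the 302 that would otherwise be swallowed by the transparent redirect.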

Setting up http redirect

I was trying to set up an HTTP redirect on Ubuntu. Spent quite some time.
1) Ubuntu has a
/etc/apache2/sites-available/default
file where you will have to change the AllowOverride option from None to All.
Only after this step will you be able to use a .htaccess file.
2) In the .htaccess file, I was doing

Redirect 302 ./index.html http://localhost/new.html


Spent more than an hour figuring out why the redirect was not happening. The problem was that I was doing
./index.html
instead of just
/index.html

Bytes and String in Py3k

Martin's Explanation:

It's really very similar to 2.x: the "bytes" type is to be used in all
interfaces that operate on byte sequences that may or may not represent characters; in particular, for interfaces where the operating system deliberately uses bytes, i.e. low-level file IO and socket IO; also for cases where the encoding is embedded in the stream that still needs to be processed (e.g. XML parsing).

(Unicode) strings should be used where the data is truly text by
nature, i.e. where no encoding information is necessary to find out
what characters are intended. It's used on interfaces where the
encoding is known (e.g. text IO, where the encoding is specified
on opening, XML parser results, with the declared encoding, and
GUI libraries, which naturally expect text).

- base64.encodestring expects bytes (naturally, since it is supposed to
encode arbitrary binary data), and produces bytes (debatably)
- binascii.b2a_hex likewise (expect and produce bytes)
- pickle.dumps produces bytes (uniformly, both for binary and text
pickles)
- marshal.dumps likewise
- email.message.Message().as_string produces a (unicode) string
(see Barry's recent thread on whether that's a good thing; the
email package hasn't been fully ported to 3k, either)
- the XML libraries (continue to) parse bytes, and produce
Unicode strings
- for the IO libraries, see above
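
A quick way to check a few of the points above is the sketch below, for Python 3. The function name bytes_and_str_demo is made up. I use base64.b64encode here as a stand-in, on the assumption that it behaves like encodestring for this purpose, since base64.encodestring is no longer available in recent Python 3 releases.

```python
import base64
import binascii
import marshal
import pickle
from email.message import Message


def bytes_and_str_demo():
    """Collect the result types of the interfaces listed above."""
    raw = "h\u00e9llo".encode("utf-8")   # text -> bytes needs an encoding
    return {
        "encoded text": type(raw),
        "base64": type(base64.b64encode(raw)),            # bytes in, bytes out
        "hex": type(binascii.b2a_hex(raw)),               # bytes in, bytes out
        "text pickle": type(pickle.dumps([1, 2], protocol=0)),
        "binary pickle": type(pickle.dumps([1, 2], protocol=2)),
        "marshal": type(marshal.dumps([1, 2])),
        "email": type(Message().as_string()),             # (unicode) string
        "decoded": raw.decode("utf-8"),                   # bytes -> text
    }
```

Everything except the email entry and the decoded value should come back as bytes.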

Backport this issue http://bugs.python.org/issue5542

update urlparse to RFC 3986

Targeting to complete this by the sprint. Issue1591035.

Does parse_header really belong to cgi?

http://bugs.python.org/issue3609

Barry's thought is it can be moved to email module.

Bugs Fixed

Fixed Bugs:
http://bugs.python.org/issue4675
http://bugs.python.org/issue4962

Simple CGIHTTPServer and Client in Python 3k

Python 3k - CGIHTTPRequestHandler example.


import http.server

class Handler(http.server.CGIHTTPRequestHandler):
    cgi_directories = ["/cgi"]

server = http.server.HTTPServer(("", 8000), Handler)
server.serve_forever()


Run this server.

Create a directory 'cgi' in the directory where you have the above server script and place the following Python CGI code in it.

filename: test.py

#!/usr/local/bin/python3.1

import cgitb;cgitb.enable()

print("Content-type: text/html")
print()
print("<title>CGI 101</title>")
print("<h1>First CGI Example</h1>")
print("<p>Hello, CGI World!</p>")



Open the browser and point it to http://localhost:8000/cgi/test.py

file opening modes.

Possible values for the mode argument of open() are

"r" or "rt"
Open for reading in text mode.
"w" or "wt"
Open for writing in text mode.
"a" or "at"
Open for appending in text mode.
"rb"
Open for reading in binary mode.
"wb"
Open for writing in binary mode.
"ab"
Open for appending in binary mode.
"r+", "r+b"
Open for reading and writing.
"w+", "w+b"
Open for reading and writing, truncating file initially.
"a+", "a+b"
Open for reading and appending.
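
A small sketch that exercises several of these modes on a temporary file, for Python 3. The function name demo_modes and the file name demo.txt are arbitrary choices of mine.

```python
import os
import tempfile


def demo_modes():
    """Try a few open() modes on a scratch file and collect what is read."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")
        with open(path, "w") as f:      # text write: creates or truncates
            f.write("one")
        with open(path, "a") as f:      # text append: adds at the end
            f.write("two")
        with open(path, "r") as f:      # text read: returns str
            results["r"] = f.read()
        with open(path, "rb") as f:     # binary read: returns bytes
            results["rb"] = f.read()
        with open(path, "r+") as f:     # read and write, no truncation
            f.write("ONE")
            f.seek(0)
            results["r+"] = f.read()
        with open(path, "w+") as f:     # read and write, truncated first
            results["w+"] = f.read()
    return results
```

The "r+" case overwrites only the first three characters, while "w+" empties the file before anything can be read from it.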