One problem with socket.recv is knowing when you are done receiving data. A TCP stream guarantees that bytes will not arrive out of order or be delivered more than once, but it does not tell you how much data will be sent to you: 100 bytes could arrive as ten groups of 10 bytes or in one shot. Ultimately, this means you have to loop in some fashion until you know the message is complete. The basic recv returns an empty string when the socket is disconnected. From that you can build a simple loop that works as long as the sender disconnects the socket at the appropriate time. However, there are situations where a local error masquerades as a clean shutdown, or where close() is never called. The three very basic methods shown below try to fix that problem. They use a timeout, an end marker, or a payload-size prefix. Since you cannot be sure just what you are going to receive, you have to be careful to get enough of a message to determine the payload size or find the end marker. I updated the recv_size method to allocate data in larger chunks when it gets a large stream of data, which can increase performance.
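The simple loop described above might look roughly like the following. This is a minimal sketch, not the recipe's own code: it assumes Python 3 (where recv returns bytes, so the "empty string" is b""), and the name recv_until_closed is made up for illustration.

```python
import socket

def recv_until_closed(sock, bufsize=8192):
    """Collect data until recv() returns b'' (the peer closed its end).

    Hypothetical helper: it only works if the sender really does
    close or shut down the connection when it has finished sending.
    """
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:          # empty result means an orderly shutdown
            break
        chunks.append(chunk)
    return b"".join(chunks)
```

Joining a list of chunks at the end avoids repeatedly copying a growing bytes object.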
I employ a trivial server to keep this as simple as possible. Just uncomment the type of receiving server you want to use to see the chosen recv type in action.
The recv_timeout function, which uses non-blocking sockets, will keep trying to get data as long as the client manages to send even a single byte. This is useful for moving data you know very little about (like encrypted data) and so cannot check for completion in a sane way.
The recv_end function relies on an end marker, tacked on by the sender and understood by both the client and the server. One problem with this is that the data cannot contain anything that looks like the marker.
The recv_size function expects the size of the payload to be prepended to the data. I use a fixed length of 4 bytes, so I always know to look for that. The size is packed to allow a terse representation that can fit sys.maxint. This avoids the problem of data looking like the marker. However, although it allows a lot of data, the payload is limited to the largest number that can be packed into the prefix. An interesting advantage of this method is that it can allocate data in larger chunks, since it knows the size of the data ahead of time. For large streams of data, I saw it increase performance by 10 times.
To test this, in another process, use the sending function that matches the server type:
    send_size(data)     # for recv_size
    send_end(data)      # for recv_end
    sock.sendall(data)  # for timeout or simple recv(8192)
    # do not forget to close if you do a raw sendall
    sock.close()
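The size-prefix idea can be sketched as below. This is an illustration under assumptions rather than the recipe's send_size/recv_size: it is written for Python 3, the names send_msg, recv_msg and recv_exact are hypothetical, and the 4-byte prefix is taken to be a big-endian signed integer (struct format ">i").

```python
import socket
import struct

HEADER = struct.Struct(">i")   # assumed prefix: 4-byte big-endian signed int

def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv may return fewer."""
    buf = bytearray()
    while len(buf) < n:
        # Never ask for more than is still missing, so bytes that belong
        # to the next message stay in the socket for the next call.
        chunk = sock.recv(min(n - len(buf), 65536))
        if not chunk:
            raise ConnectionError("socket closed before message was complete")
        buf.extend(chunk)
    return bytes(buf)

def send_msg(sock, payload):
    """Send the payload length, then the payload itself."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_msg(sock):
    """Read the 4-byte length, then exactly that many payload bytes."""
    (size,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, size)
```

Because the receiver knows the size up front, it can request large chunks without ever reading past the end of the current message.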
Tags: network
3 comments

For the recv_size() method, the client cannot send messages too quickly; there must be an interval of at least 0.1 s between two messages. Otherwise the later message will be lost.
e.g.
    socket.send(struct.pack('>i', len('hello')) + 'hello')
    socket.send(struct.pack('>i', len('how are you?')) + 'how are you?')
the second message 'how are you?' will be lost. With a delay between the two sends, both arrive:
    socket.send(struct.pack('>i', len('hello')) + 'hello')
    time.sleep(0.1)
    socket.send(struct.pack('>i', len('how are you?')) + 'how are you?')
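A likely explanation for the lost message is that TCP may deliver both sends in a single recv, so a receiver that keeps only the first payload discards the rest. One way around the sleep, sketched here as an assumption-laden illustration in Python 3, is to parse every complete frame out of whatever has been received and keep the leftover bytes for later. The function name split_frames is hypothetical.

```python
import struct

def split_frames(buffer):
    """Split a byte buffer of '>i'-prefixed frames.

    Returns (messages, leftover), where leftover holds the bytes of any
    frame that has not fully arrived yet. Hypothetical helper.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (size,) = struct.unpack_from(">i", buffer, offset)
        if len(buffer) - offset - 4 < size:
            break                      # payload incomplete, wait for more
        start = offset + 4
        messages.append(buffer[start:start + size])
        offset = start + size
    return messages, buffer[offset:]
```

A receiver would append each recv result to the leftover from the previous call and run split_frames on the combined bytes.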
The recv_end code has a bug:
If one recv() returns 'something useable ', and the next recv() returns 'as an ', and the third recv() returns 'end marker', the marker will not be recognized.
Changing End to a single character (e.g. '\n' or '\0') would fix this.
Another problem: recv_end should maintain a buffer across calls, so you don't end up throwing out the beginning of the next message.
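Both points can be addressed by searching the whole accumulated buffer for the marker and keeping the remainder between calls. The following is only a sketch, assuming Python 3; the class name MarkerReader, the method read_message and the marker value are invented for illustration and are not the recipe's.

```python
import socket

class MarkerReader:
    """Read marker-terminated messages, keeping leftover bytes between calls."""

    def __init__(self, sock, marker=b"<EOM>"):   # marker value is an assumption
        self.sock = sock
        self.marker = marker
        self.buffer = b""

    def read_message(self):
        # Search the accumulated buffer, not just the latest chunk, so a
        # marker split across several recv() calls is still found.
        while self.marker not in self.buffer:
            chunk = self.sock.recv(8192)
            if not chunk:
                raise ConnectionError("socket closed before end marker arrived")
            self.buffer += chunk
        # Everything after the marker is the start of the next message.
        message, _, self.buffer = self.buffer.partition(self.marker)
        return message
```

A multi-character marker works here because the search always runs over the joined data.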
I don't like to see bare except: clauses. It should at the very least be except socket.error:
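As an illustration of catching only the expected exception, here is a rough timeout-based receive loop. It is a sketch under assumptions, not the recipe's recv_timeout: it targets Python 3, uses settimeout rather than a non-blocking socket, and the name recv_idle_timeout is hypothetical.

```python
import socket

def recv_idle_timeout(sock, timeout=2.0, bufsize=8192):
    """Read until the peer is silent for `timeout` seconds or closes."""
    chunks = []
    previous = sock.gettimeout()
    sock.settimeout(timeout)
    try:
        while True:
            try:
                chunk = sock.recv(bufsize)
            except socket.timeout:     # only the expected error is caught
                break
            if not chunk:              # peer closed the connection
                break
            chunks.append(chunk)
    finally:
        sock.settimeout(previous)      # restore the caller's timeout setting
    return b"".join(chunks)
```

Any other socket error propagates to the caller instead of being silently treated as the end of the data.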