C# - Intermittent aborted connections when web scraping using WebClient
Sometimes I get an error, and sometimes I get the data from the site.
The error:
    Unhandled exception: System.Net.WebException: The request was aborted: The connection was closed unexpectedly.
       at System.Net.ConnectStream.Read(Byte[] buffer, Int32 offset, Int32 size)
       at System.IO.StreamReader.ReadBuffer()
       at System.IO.StreamReader.ReadLine()
       at ConsoleApplication3.Download.GetUrlData() in c:\users\user\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\program.cs:line 41
       at ConsoleApplication3.Program.Main(String[] args) in c:\users\user\documents\visual studio 2013\projects\consoleapplication3\consoleapplication3\program.cs:line 55
(Translated from the Polish original: "Żądanie zostało przerwane: połączenie zostało nieoczekiwanie zakończone.")
My code:
    public string GetUrlData()
    {
        WebClient client = new WebClient();
        Random r = new Random();
        // Random IP address
        client.Headers["X-Forwarded-For"] = r.Next(0, 255) + "." + r.Next(0, 255) + "." + r.Next(0, 255) + "." + r.Next(0, 255);
        // Random User-Agent
        client.Headers["User-Agent"] = "Mozilla/" + r.Next(3, 5) + ".0 (Windows NT " + r.Next(3, 5) + "." + r.Next(0, 2) + "; rv:2.0.1) Gecko/20100101 Firefox/" + r.Next(3, 5) + "." + r.Next(0, 5) + "." + r.Next(0, 5);
        Stream dataStream = client.OpenRead(url);
        StreamReader reader = new StreamReader(dataStream);
        StringBuilder sb = new StringBuilder();
        while (!reader.EndOfStream)
            sb.Append(reader.ReadLine());
        return sb.ToString();
    }
And Main:
    var d = new Download("http://wiocha.pl");
    var str = d.GetUrlData();
    Console.WriteLine(str);
What should I do to get the data every time, without the error?
Limit on max concurrent clients
I know of a limitation with HttpWebRequest: you can only have 1 or 2 active client requests at a time. I am not familiar with the WebClient way, but you must actively dispose of the HttpWebResponse object when using the HttpWebRequest way. Therefore, I suggest you first try disposing dataStream before returning the string.
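The disposal suggested above could look something like the following. This is only a minimal sketch, not the asker's final code: the class name DisposingDownload and the helper ReadAllAndDispose are hypothetical names introduced for illustration, and UTF-8 is an assumed encoding. The using blocks dispose the reader, the stream, and the client even when reading throws, so the underlying connection is released.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class, introduced only for this illustration.
public static class DisposingDownload
{
    // Reads the whole stream as text. The using blocks dispose the reader
    // and the stream on every exit path, including exceptions.
    public static string ReadAllAndDispose(Stream dataStream)
    {
        using (dataStream)
        using (var reader = new StreamReader(dataStream, Encoding.UTF8)) // UTF-8 is an assumption
        {
            return reader.ReadToEnd();
        }
    }

    // Downloads the page at url and releases the client and the response stream.
    public static string GetUrlData(string url)
    {
        using (var client = new WebClient())
        {
            return ReadAllAndDispose(client.OpenRead(url));
        }
    }
}
```

Disposing will not help if the remote server itself is closing the connection, but it rules out leaked connections on the client side.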
Not a programming problem
It looks like you are web scraping, and they are cutting you off. You need to:
- limit how often you send requests, and;
- vary your source IP by using other machines/networks with different IPs and/or by using VPNs.
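One way to limit how often requests are sent is to pause between attempts and back off after each failure. The sketch below is an assumption about how that might be done, not something the answer prescribes: PoliteFetcher and FetchWithRetry are hypothetical names, and the doubling delay is an arbitrary choice. It retries only on WebException and rethrows the last failure once the attempts run out.

```csharp
using System;
using System.Net;
using System.Threading;

// Hypothetical helper class, introduced only for this illustration.
public static class PoliteFetcher
{
    // Calls fetch up to maxAttempts times. After each WebException it sleeps,
    // then doubles the pause before the next attempt. The final failure is rethrown.
    public static string FetchWithRetry(Func<string> fetch, int maxAttempts, TimeSpan initialDelay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        TimeSpan delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return fetch();
            }
            catch (WebException)
            {
                if (attempt >= maxAttempts) throw;            // out of attempts: surface the error
                Thread.Sleep(delay);                          // wait before hitting the site again
                delay = TimeSpan.FromTicks(delay.Ticks * 2);  // back off: 2s, 4s, 8s, ...
            }
        }
    }
}
```

With the Download class from the question, a call might look like PoliteFetcher.FetchWithRetry(() => d.GetUrlData(), 3, TimeSpan.FromSeconds(2)). This only spaces out retries; it does nothing about the source IP.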
Another improvement
Also, there's an easier way to read the input stream into a single string:
    var responseString = reader.ReadToEnd();
You wouldn't need the StringBuilder or the while loop.