Reading from System.IO.__ConsoleStream hangs forever

Issue #23 resolved
Alexander Kahl
created an issue

The attached simple program reproduces a bug in FParsec (0.9.1): after the initial calls at CharStream.cs:526, any subsequent call to stream.Read() makes the process wait forever, either at Text.cs:116 or at CharStream.cs:695. So far I have failed to reproduce this behavior outside of FParsec using simple, repeated calls on the STDIN stream. There, usually only the very first call blocks until data is received, and any subsequent call without further input reads zero bytes and returns immediately. So I can only //guess// that the underlying implementation in FParsec is somehow causing the problem.

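The code I used to attempt reproduction of the behavior is simply

{{{
open System

// 4 KiB read buffer, zero-initialized
let buffer = Array.init 4096 (fun _ -> 0uy)
let stream = Console.OpenStandardInput()
// First call: blocks until input arrives
printfn "%d bytes read" (stream.Read(buffer, 0, buffer.Length))
// Second call: reads 0 bytes and returns immediately
printfn "%d bytes read" (stream.Read(buffer, 0, buffer.Length))
}}}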

But that works flawlessly.

The attached program //should// indeed block if no input stream is provided. To see the problem in action, pipe in any input, e.g. //echo "Foo" | ReadBug//, and attach a debugger.

Comments (7)

  1. Stephan Tolksdorf

    Thank you for taking the time to submit this bug report.

    If a System.IO.Stream has reached the end, the Read method must return 0 when invoked. Apparently, the problem in your example is that the StandardInput doesn't report the end of the stream at the end of the piped input. With a parser like "many anyChar" this must lead to a hang, as the parser will eagerly try to consume the complete input stream. Most other parsers will hang too, though, since FParsec's CharStream class reads the input blockwise and will keep calling the Read method of the underlying System.IO.Stream until an internal buffer is filled or the method returns 0 to signal the end of the stream.
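    The blockwise reading described above can be sketched roughly as follows. This is a simplified illustration only, not FParsec's actual code; the helper name fillBuffer is an assumption made for this example. It shows why a stream that never returns 0 keeps the reader waiting, however small the parser's demand is.

```fsharp
open System.IO

/// Hypothetical helper, NOT FParsec's actual implementation:
/// keeps calling Read until the buffer is full or Read returns 0.
let fillBuffer (stream: Stream) (buffer: byte[]) : int =
    let mutable total = 0
    let mutable atEnd = false
    while not atEnd && total < buffer.Length do
        // Read may return fewer bytes than requested, so ask again for the rest.
        let n = stream.Read(buffer, total, buffer.Length - total)
        if n = 0 then atEnd <- true   // 0 is the only end-of-stream signal
        else total <- total + n
    total
```

    With a stream shorter than the buffer, the loop only stops once Read returns 0, so a Read that blocks instead of returning 0 stalls the whole loop.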

    Thus, to parse input from StandardInput, you'll either have to wrap the stream with your own System.IO.Stream class that properly signals the end of the stream, or you simply read the input into a string (or multiple ones) and then run the FParsec parser on the string(s).

  2. Alexander Kahl reporter
    • changed status to new

    Thank you very much for looking into this issue. I'm afraid, however, that ConsoleStream's Read() does return 0 on each subsequent call after all input has been consumed, as seen in the description's sample code. Also, using a fixed-size parser such as pstring "foo" leads to the same behavior, even before a match could be determined (you can easily verify this by piping in a non-matching word). So what makes Read() block here?

    In case of using a wrapping stream class: What is the expected behavior in case of EOS? Returning 0? A special exception?

  3. Stephan Tolksdorf

    You're right, I didn't look into why your sample code directly calling stream.Read doesn't hang.

    The difference is that the current CharStream implementation will call stream.Read three times for small input streams: 2x at CharStream.cs:526 and 1x at Text.cs:116. Normally the two last calls both should return 0, but apparently the ConsoleStream implements some non-standard (buggy?) behaviour where the third call will actually try to read input again. You can reproduce this by adding a third printfn ... stream.Read... to your sample.
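    A minimal sketch of that three-call check, assuming a hypothetical helper named readThrice (not part of the original sample). On a well-behaved stream the second and third calls both return 0; according to the observation above, the third call on .NET's standard input stream may block instead.

```fsharp
open System.IO

/// Hypothetical helper: calls Read three times in a row and
/// returns the byte count reported by each call.
let readThrice (stream: Stream) : int list =
    let buffer : byte[] = Array.zeroCreate 4096
    [ for _ in 1 .. 3 -> stream.Read(buffer, 0, buffer.Length) ]

// Usage against piped standard input, e.g. echo "Foo" | ReadBug:
// readThrice (System.Console.OpenStandardInput())
// |> List.iter (printfn "%d bytes read")
```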

    Using pstring "foo" doesn't make a difference because the CharStream reads the input blockwise (in chunks of a fixed size, see the CharStream reference), as I tried to explain above.

    You could correct this issue in a wrapper class by consistently returning 0 at the end of the stream.
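    Such a wrapper might look like the sketch below. The class name EndLatchingStream is an assumption made for this example, and the code is untested against the ConsoleStream itself. Once the inner stream's Read has returned 0, every later Read returns 0 without touching the inner stream again.

```fsharp
open System
open System.IO

/// Hypothetical read-only wrapper: remembers the first end-of-stream signal
/// and repeats it on every later Read call.
type EndLatchingStream(inner: Stream) =
    inherit Stream()
    let mutable ended = false

    override this.CanRead = true
    override this.CanSeek = false
    override this.CanWrite = false
    override this.Length : int64 = raise (NotSupportedException())
    override this.Position
        with get () : int64 = raise (NotSupportedException())
        and set (value: int64) : unit = raise (NotSupportedException())
    override this.Flush() = ()
    override this.Seek(offset: int64, origin: SeekOrigin) : int64 =
        raise (NotSupportedException())
    override this.SetLength(value: int64) : unit =
        raise (NotSupportedException())
    override this.Write(buffer: byte[], offset: int, count: int) : unit =
        raise (NotSupportedException())

    override this.Read(buffer: byte[], offset: int, count: int) : int =
        if ended || count = 0 then 0
        else
            let n = inner.Read(buffer, offset, count)
            // Latch the end so the inner stream is never asked again.
            if n = 0 then ended <- true
            n
```

    The wrapped stream, e.g. new EndLatchingStream(Console.OpenStandardInput()), would then be handed to the parser instead of the raw console stream.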

    I'll see if I can change the CharStream implementation to ensure that stream.Read is never called after it returned 0 once.

  4. Alexander Kahl reporter

    This is interesting! Calling Read() thrice in Mono does not trigger a hang; furthermore, IIRC feeding more input while .NET's Read() hangs does not cause execution to continue, if I've done everything right.

    I'm looking forward to your workaround, as parsing from input streams is essential for my downstream project.