some intermittent failures when run as a group. The test should be more robust to empty return values and should fail explicitly if a streaming response is malformed.
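
A minimal sketch of the kind of check described above, assuming each streamed chunk arrives as a JSON-encoded string. The helper name `collect_stream_chunks` and the chunk format are hypothetical and do not come from the actual test suite; the point is only to show empty chunks being tolerated and malformed ones failing with a clear message.

```python
import json
from typing import Iterable, List, Optional


def collect_stream_chunks(raw_chunks: Iterable[Optional[str]]) -> List[dict]:
    """Parse streamed chunks, skipping empty ones and failing loudly on malformed ones.

    Hypothetical helper: assumes every non-empty chunk is a JSON object.
    """
    parsed: List[dict] = []
    for index, raw in enumerate(raw_chunks):
        # Empty or whitespace-only chunks are tolerated rather than treated as errors.
        if raw is None or not raw.strip():
            continue
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError as exc:
            # Fail explicitly, with context, instead of letting a later assertion fail obscurely.
            raise AssertionError(f"malformed streaming chunk #{index}: {raw!r}") from exc
        if not isinstance(payload, dict):
            raise AssertionError(
                f"unexpected chunk type at #{index}: {type(payload).__name__}"
            )
        parsed.append(payload)
    return parsed
```

A test could call this helper on the collected stream and then assert on the parsed list, so that an empty stream yields an empty list and a broken chunk is reported by index.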