Original discussion is here.
Firehose is designed to support any file format, but FirehoseFactory isn't: it must be associated with a specific type of InputRowParser at compile time. This means we have to implement a separate FirehoseFactory for each file format, which significantly limits extensibility.
As originally intended, FirehoseFactory should specify how to create a new Firehose, that is, how data comes into Druid (for example, by downloading S3 objects in the static S3 firehose), while InputRowParser specifies how Druid parses the data coming in through the Firehose.
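To make the intended separation concrete, here is a minimal sketch. It does not use Druid's actual interfaces: `RowParser`, `LineFirehose`, `LineFirehoseFactory`, and `InMemoryFirehoseFactory` are hypothetical, simplified stand-ins, and the in-memory string stands in for a real source such as S3 objects. The point it illustrates is that the factory only knows where the data comes from, and the parser is supplied at connect time rather than fixed at compile time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class FirehoseSketch {
    // Hypothetical stand-in for InputRowParser: knows only how to parse one record.
    interface RowParser<T> {
        T parse(String line);
    }

    // Hypothetical stand-in for Firehose: yields parsed rows one at a time.
    interface LineFirehose<T> {
        boolean hasMore();
        T nextRow();
    }

    // Hypothetical stand-in for FirehoseFactory: the factory type itself is not
    // tied to a parser type; the parser is chosen per connect() call.
    interface LineFirehoseFactory {
        <T> LineFirehose<T> connect(RowParser<T> parser);
    }

    // Knows only where the data comes from (an in-memory string here,
    // standing in for something like downloaded S3 objects).
    static class InMemoryFirehoseFactory implements LineFirehoseFactory {
        private final String data;

        InMemoryFirehoseFactory(String data) {
            this.data = data;
        }

        @Override
        public <T> LineFirehose<T> connect(RowParser<T> parser) {
            Iterator<String> lines = Arrays.asList(data.split("\n")).iterator();
            return new LineFirehose<T>() {
                @Override
                public boolean hasMore() {
                    return lines.hasNext();
                }

                @Override
                public T nextRow() {
                    // Parsing is delegated entirely to the supplied parser.
                    return parser.parse(lines.next());
                }
            };
        }
    }

    // Drains a firehose created from the given factory and parser.
    static <T> List<T> readAll(LineFirehoseFactory factory, RowParser<T> parser) {
        List<T> rows = new ArrayList<>();
        LineFirehose<T> firehose = factory.connect(parser);
        while (firehose.hasMore()) {
            rows.add(firehose.nextRow());
        }
        return rows;
    }

    public static void main(String[] args) {
        LineFirehoseFactory factory = new InMemoryFirehoseFactory("a,1\nb,2");
        // Same factory, two different parsers: no new factory class per format.
        List<List<String>> csvRows = readAll(factory, line -> Arrays.asList(line.split(",")));
        List<Integer> lengths = readAll(factory, String::length);
        System.out.println(csvRows);
        System.out.println(lengths);
    }
}
```

In this sketch, supporting a new file format only requires a new `RowParser`; the factory that fetches the data is reused unchanged.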