MTECH PROJECTS
A Scalable Big Data Test Framework

This paper identifies three problems in testing software that uses Hadoop-based big data techniques. First, processing big data takes a long time. Second, big data is transferred and transformed among many services. Do we need to validate the data at every transition point? Third, how should we validate the transferred and transformed data? We are developing a novel big data test framework to address these problems. The test framework generates a small, representative dataset from the original large dataset using input space partition testing. Using this dataset for development and testing would not hinder continuous integration and delivery in agile processes. The test framework also accesses and validates data at the various transition points where data is transferred and transformed.
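
To illustrate the idea of input space partition testing used to shrink the dataset, the following Java sketch is offered as an assumption of how such sampling might look, not as the paper's implementation. The record fields (age, purchase) and the block boundaries are hypothetical; the sketch keeps one representative record per combination of partition blocks.

import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionSampler {

    // Hypothetical record type standing in for one row of the large dataset.
    record Customer(int age, String country, double purchase) {}

    // Characteristic 1: blocks of the age domain.
    static String ageBlock(Customer c) {
        if (c.age() < 0)  return "invalid";
        if (c.age() < 18) return "minor";
        if (c.age() < 65) return "adult";
        return "senior";
    }

    // Characteristic 2: blocks of the purchase-size domain.
    static String purchaseBlock(Customer c) {
        return c.purchase() < 100.0 ? "small" : "large";
    }

    // Keep the first record seen for every combination of blocks, so the
    // small sample still covers each combination of the input space.
    static List<Customer> sample(List<Customer> bigData) {
        Map<String, Customer> representatives = new LinkedHashMap<>();
        for (Customer c : bigData) {
            String key = ageBlock(c) + "/" + purchaseBlock(c);
            representatives.putIfAbsent(key, c);
        }
        return new ArrayList<>(representatives.values());
    }

    public static void main(String[] args) {
        List<Customer> bigData = List.of(
            new Customer(25, "US", 20.0),
            new Customer(30, "DE", 250.0),
            new Customer(12, "US", 15.0),
            new Customer(70, "FR", 500.0),
            new Customer(40, "US", 30.0),   // same blocks as the first record, skipped
            new Customer(-1, "??", 10.0));  // falls into the "invalid" age block
        sample(bigData).forEach(System.out::println);
    }
}

In a real deployment the loop would run over the original Hadoop-resident data rather than an in-memory list, but the selection logic per record is the same.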
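
The validation of data at transition points can likewise be sketched. The stages, invariants, and names below are assumptions made for illustration only; the idea is that after every hop where data is transferred or transformed, simple checks (record count preserved, no blank records) run before the data moves on.

import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

public class TransitionValidator {

    // Validate the output of one transition point against its input.
    static void validate(String stage, List<String> before, List<String> after) {
        if (after.size() != before.size()) {
            throw new IllegalStateException(
                stage + ": record count changed from " + before.size() + " to " + after.size());
        }
        if (after.stream().anyMatch(String::isBlank)) {
            throw new IllegalStateException(stage + ": produced blank records");
        }
        System.out.println(stage + ": " + after.size() + " records OK");
    }

    // Run one pipeline stage and check the data at its transition point.
    static List<String> runStage(String stage, List<String> input, UnaryOperator<String> fn) {
        List<String> output = input.stream().map(fn).collect(Collectors.toList());
        validate(stage, input, output);
        return output;
    }

    public static void main(String[] args) {
        List<String> raw = List.of("Alice,25", "Bob,30", "Carol,70");
        List<String> cleaned = runStage("clean", raw, s -> s.trim().toLowerCase());
        List<String> keyed   = runStage("key",   cleaned, s -> s.split(",")[0]);
        System.out.println(keyed);
    }
}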