You can still save big on Microsoft’s Surface PCs with these Cyber Monday deals

Microsoft’s Surface PCs are some of the best on the market, but they don’t always come cheap. These are premium devices with a premium price tag, but thankfully, Cyber Monday deals are here to help make things a little easier on the wallet. You can save a lot of money on different Surface devices right … Read more

On Some Aspects of Big Data Processing in Apache Spark, Part 4: Versatile JSON and YAML Parsers

In my previous post, I presented design patterns to program Spark applications in a modular, maintainable, and serializable way. This time I demonstrate how to configure versatile JSON and YAML parsers to be used in Spark applications. A Spark application typically needs to ingest JSON data to transform the data, and then save the data in … Read more

On Some Aspects of Big Data Processing in Apache Spark, Part 3: How To Deal With Malformed Data?

In my previous post, I presented design patterns to program Spark applications in a modular, maintainable, and serializable way. This time I demonstrate a solution for dealing with malformed date/time data, and show how to set a default value for malformed data. When I worked on a big data project, my tasks were to load data … Read more

Data Pipeline Orchestration – DZone Big Data

The spectrum of tasks data scientists and engineers need to handle today is tremendous. You will find the best description of it in this article: The AI Hierarchy of Needs. A long series of technical problems needs to be solved before you can start working on the real business problem and build valuable data products. … Read more

Big Data Processing, Apache Spark: Design Patterns

In my previous post, I demonstrated how Spark creates and serializes tasks. In this post, I show how to utilize this knowledge to construct Spark applications in a maintainable and upgradable way, where at the same time “task not serializable” exceptions are avoided. When I participated in a big data project, I needed to program … Read more

Does Datameer Support a Full Big Data Analysis Process?

Over the last few days, I had the chance to test the Datameer Analytics Solution (DAS). DAS is a platform for Hadoop which includes data source integration, an analytics engine, and visualization functionality. This promise of a fully integrated big data analysis process motivated me to test the product. It really includes all required functionality for … Read more

Azure Databricks Automated Testing – DZone Big Data

We all know how important data testing is in this world of digital transformation. ETL testing mainly consists of ensuring that data has safely traveled from its source to its destination. Data processing is prone to errors, and you may end up with lost, corrupted, or irrelevant data as a result of various issues … Read more

How to Enable Big Data Best Practices

Why do you need Big Data best practices? Whether it’s government, private organizations, military, healthcare, or agencies such as NASA, there is no better option to meet analytics needs, increase productivity, and become more efficient. For quite some time now, Big Data has been the fulcrum on which organizations of all shapes and sizes turn. … Read more

Designing a Multi-Language Database – DZone Big Data

Today we look at three best-practice database designs to store data in multiple languages and easily scale to new markets. Reaching millions of users with an application is every developer’s dream. Achieving this goal becomes easier if users from all over the world can use your application. Since not all users know English or your … Read more