Recently I've been working in Azure to implement ETL jobs. The main tool is ADF (Azure Data Factory). This post shows some solutions to issues in my w...
Archive - 2020
2020
scala ref(https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html) create dataframe
Making Spark run faster and faster: Cluster Optimization, Parameter Optimization, Code Optimization. Cluster Optimization Locality Level...
Spark Submit options txt --master MASTER_URL --> run mode, e.g. spark://host:port, mesos://host:port, yarn, or local. --deploy-mode DEPLOY_...
Code snippet py import airflow from airflow.models import DAG from airflow.operators.python_operator import PythonOperator defa...
Whitening Transformation
Spark Structured Streaming While recently reading the blog post Structured Streaming in PySpark(https://hackersandslackers.com/structured-streaming-in-pyspark/) I...
Batch Normalization is one of the important parts of our NN. Why we need Normalization The paper's title tells me the reason: Batch Normalization: Accelerati...
Gradient-based optimization algorithms Gradient Descent variants Batch Gradient Descent (BGD) Vanilla gradient descent, aka batch g...