Installing Python 3
https://www.yuzhi100.com/tutorial/centos/centos-anzhuang-python36
    # Install the EPEL dependency
    sudo yum install epel-release
    # Install the IUS repository
    sudo yum install https://centos7.iuscommunity.org/ius-release.rpm
    # Install Python 3.6 and link it as python3
    sudo yum install python36u
    sudo ln -s /bin/python3.6 /bin/python3
    # Install pip for Python 3.6 and link it as pip3
    sudo yum install python36u-pip
    sudo ln -s /bin/pip3.6 /bin/pip3
Installing Airflow
1. Add the environment variable

    export SLUGIFY_USES_TEXT_UNIDECODE=yes
2. Install the build dependencies

    sudo yum install python36u-devel.x86_64
    sudo yum install mysql-community-devel.x86_64
    # Fixes the error "sasl/sasl.h: No such file or directory"
    sudo yum install gcc-c++ cyrus-sasl-devel.x86_64
3. Configure the metadata database (MySQL)

    -- xxxx
    CREATE DATABASE airflow;
    GRANT ALL PRIVILEGES ON airflow.* TO 'root'@'localhost' IDENTIFIED BY 'xxxx';
    ALTER USER 'root'@'localhost' IDENTIFIED BY 'xxxx' PASSWORD EXPIRE NEVER;
    ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'xxxx';
4. Configure $AIRFLOW_HOME/airflow.cfg
Add AIRFLOW_HOME to your environment variables, then point Airflow at the metadata database in airflow.cfg:

    sql_alchemy_conn = mysql://root:xxxx@localhost:3306/airflow
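If the database password contains characters such as `@` or `/`, they need to be percent-encoded before they go into the `sql_alchemy_conn` URL. The following is a minimal sketch using only the standard library; `build_sql_alchemy_conn` is a hypothetical helper name, not part of Airflow, and the sample password is made up for illustration.

```python
from urllib.parse import quote_plus


def build_sql_alchemy_conn(user, password, host="localhost", port=3306,
                           database="airflow"):
    """Return a mysql:// URL with the user and password percent-encoded.

    Hypothetical helper for illustration only; Airflow just reads the
    resulting string from airflow.cfg.
    """
    return "mysql://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, database
    )


if __name__ == "__main__":
    # 'p@ss/word' is a made-up password that shows why encoding matters.
    print(build_sql_alchemy_conn("root", "p@ss/word"))
```

Whether a literal `%` must additionally be escaped inside airflow.cfg may depend on the Airflow version, so it is worth checking the value Airflow actually reads after pasting it in.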
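Generate a Fernet key for the fernet_key setting:

    # -*- coding: utf-8 -*-
    from cryptography.fernet import Fernet

    fernet_key = Fernet.generate_key()
    # generate_key() returns bytes; decode so the printed value can be pasted as-is
    print(fernet_key.decode())  # your fernet_key; keep it in a secure place!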
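A Fernet key is 32 random bytes encoded as URL-safe base64. As a quick sanity check before pasting a key into airflow.cfg, the sketch below verifies that shape with the standard library alone. `looks_like_fernet_key` is a hypothetical name, and this check only inspects the format; it does not replace the `cryptography` package.

```python
import base64
import binascii
import os


def looks_like_fernet_key(key):
    """Return True if key decodes from URL-safe base64 to exactly 32 bytes."""
    try:
        if isinstance(key, str):
            key = key.encode("ascii")
        raw = base64.urlsafe_b64decode(key)
    except (binascii.Error, ValueError):
        # Covers bad padding and non-ASCII input.
        return False
    return len(raw) == 32


if __name__ == "__main__":
    # Build a key of the same shape: 32 random bytes, URL-safe base64.
    sample = base64.urlsafe_b64encode(os.urandom(32)).decode()
    print(sample, looks_like_fernet_key(sample))
    # A placeholder such as 'xxxx' does not have the right shape.
    print(looks_like_fernet_key("xxxx"))
```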
    # Install the encryption module
    pip install flask-bcrypt
Expose port 5001.
5. Configure a user
    # -*- coding: utf-8 -*-
    # Run in a Python shell to create a web UI login user
    import airflow
    from airflow import models, settings
    from airflow.contrib.auth.backends.password_auth import PasswordUser

    user = PasswordUser(models.User())
    user.username = 'alithink'
    user.email = 'xxxx'
    user.password = 'xxxx'
    session = settings.Session()
    session.add(user)
    session.commit()
    session.close()
    exit()
Starting Airflow

    nohup airflow webserver -p 5001 &
    # The scheduler must be restarted after every resetdb
    nohup airflow scheduler &
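Because both processes are started in the background with nohup, it can help to confirm that the web server is actually accepting connections. The sketch below is one way to do that with a plain TCP connect; `is_port_open` is a hypothetical helper, and it assumes the web server is reachable on 127.0.0.1 at the port 5001 used above.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    # 5001 matches the port passed to `airflow webserver -p 5001`.
    print("webserver reachable:", is_port_open("127.0.0.1", 5001))
```

A successful connect only shows that something is listening on the port; the nohup.out file is still the place to look for startup errors.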
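Airflow tips
- Restart Airflow after changing the cfg file.
- Times default to UTC, so take the time zone into account when configuring a DAG (the web UI only supports UTC…).
- After a DAG's switch is set to On, if the scheduler is already running, the tasks for every schedule point from start_date up to now are executed one after another.
- A DAG can also be run manually by clicking the trigger button.
- To see the log for a node, click the corresponding task and open the task instance log.
- Because times are in UTC, subtract 8 hours from the local time you intend to schedule (for a UTC+8 time zone).
- catchup: if the specified start date is earlier than the current time and catchup is set to True, Airflow runs every schedule point that was "missed" in the past.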
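The UTC and catchup tips above can be worked through with plain datetime arithmetic. The sketch below is a simplified model, not Airflow's scheduler code: it assumes a local time zone of UTC+8 and a fixed schedule interval, and both `local_to_utc` and `missed_runs` are hypothetical helper names. It relies on the fact that Airflow triggers the run for a schedule point only once that point's interval has ended.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the intended local time zone is UTC+8, matching the
# "subtract 8 hours" tip.
LOCAL_TZ = timezone(timedelta(hours=8))


def local_to_utc(local_naive):
    """Interpret a naive datetime as UTC+8 and return it as aware UTC."""
    return local_naive.replace(tzinfo=LOCAL_TZ).astimezone(timezone.utc)


def missed_runs(start_date, now, interval):
    """List the schedule points a catchup would backfill.

    Simplified model: a point is included only when its whole interval
    has already ended (point + interval <= now).
    """
    runs = []
    current = start_date
    while current + interval <= now:
        runs.append(current)
        current += interval
    return runs


if __name__ == "__main__":
    # A job meant to run daily at 09:00 local time, starting 2019-01-01.
    start = local_to_utc(datetime(2019, 1, 1, 9, 0))
    now = local_to_utc(datetime(2019, 1, 4, 12, 0))
    print("start_date in UTC:", start.isoformat())
    for run in missed_runs(start, now, timedelta(days=1)):
        print("backfilled schedule point:", run.isoformat())
```

With catchup disabled, the scheduler skips these past schedule points instead of running each one, which is usually what you want when turning on a DAG whose start_date is far in the past.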