Parsing the Template Syntax of UCenter Home 1.5


In UCH, templates are kept separate from dynamic data, so many PHP files end by including a template file. For example, cp_blog.php ends with include_once template("cp_blog");
 
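The code below follows a naming convention: $tpl is the template path without an extension, $tplfile is the template file with the .htm extension, and $objfile is the cache file with the .php extension.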
 
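UCH uses a template as follows:
The PHP code gathers the dynamic data and then calls include_once template($tpl).
The template function has the template file $tplfile parsed when necessary and returns the path of the cache file $objfile.
Inside template, parse_template is called to parse $tplfile.
$tplfile uses a syntax that UCH defines. As parse_template shows, it includes tags such as <!--{template name}--> and <!--{block/name}-->. parse_template replaces each of these tags with the result of the corresponding function, such as readtemplate or blocktags, and these functions are also defined in function_template.php.
The slightly tricky part is that a template file may itself include other template files through <!--{template name}-->, so parse_template calls preg_replace twice with the same pattern. The first pass reads in the child templates that $tplfile includes, and the second pass reads in the grandchild templates that those children include. UCH nests templates at most three levels deep (parent -> child -> grandchild), so no third preg_replace call is needed to read great-grandchild templates that a grandchild might include.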
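
The two-pass include expansion described above can be sketched in Python. This is only an illustration under stated assumptions, not UCH code: the TEMPLATES dictionary stands in for the .htm files on disk, and read_template and expand_includes are hypothetical names. A callback takes the place of PHP's e modifier.

```python
import re

# Hypothetical in-memory stand-in for the .htm template files.
TEMPLATES = {
    "parent": "A<!--{template child}-->Z",
    "child": "B<!--{template grandchild}-->Y",
    "grandchild": "C",
}

# Same shape as the include tag: <!--{template name}-->, case-insensitive.
INCLUDE_RE = re.compile(r"<!--\{template\s+([a-z0-9_/]+)\}-->", re.IGNORECASE)


def read_template(match):
    """Return the body of the included template, or an empty string."""
    return TEMPLATES.get(match.group(1), "")


def expand_includes(text, passes=2):
    """Run the include substitution a fixed number of times.

    One pass inlines children; a second pass inlines the grandchildren
    that the first pass brought into the text.
    """
    for _ in range(passes):
        text = INCLUDE_RE.sub(read_template, text)
    return text


print(expand_includes(TEMPLATES["parent"], passes=1))
print(expand_includes(TEMPLATES["parent"], passes=2))
```

With a single pass the grandchild tag is still present in the output, which is why the same substitution has to run a second time.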
 
 
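The template function is defined in function_common.php
function template($name) {
global $_SCONFIG, $_SGLOBAL;
 
if(strexists($name,'/')) {
$tpl = $name;//$name is already a full path
} else {
$tpl = "template/$_SCONFIG[template]/$name";
/*
$name is only a file name, so $tpl looks like template/default/cp_blog or template/blue/cp_blog
The default template style is default, but if the user picks another style, $_SCONFIG['template'] takes another value such as blue
The template style can be selected at the bottom right of the home page
*/
}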
$objfile = S_ROOT.'./data/tpl_cache/'.str_replace('/','_',$tpl).'.php';
/*
Cache file name: $objfile looks like data/tpl_cache/template_default_cp_blog.php or data/tpl_cache/template_blue_cp_blog.php
*/
if(!file_exists($objfile)) {
//The cache file does not exist, so parse the template file
include_once(S_ROOT.'./source/function_template.php');
parse_template($tpl);
}
return $objfile;
}
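
The path handling in template can be mirrored in a few lines. The following is a minimal Python sketch rather than UCH code: cache_path and its style parameter are hypothetical names, the S_ROOT prefix is left out, and the file-existence check is omitted.

```python
def cache_path(name, style="default"):
    """Map a template name to the cache file name, as template() does.

    A name containing "/" is treated as a full template path; otherwise
    it is looked up under template/<style>/.
    """
    tpl = name if "/" in name else f"template/{style}/{name}"
    return "data/tpl_cache/" + tpl.replace("/", "_") + ".php"


print(cache_path("cp_blog"))
print(cache_path("cp_blog", style="blue"))
```

Flattening the slashes into underscores keeps every cache file in one directory, whatever style or subdirectory the template came from.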
 
The parse_template function is defined in function_template.php
function parse_template($tpl) {
global $_SGLOBAL, $_SC, $_SCONFIG;
 
//Included templates
$_SGLOBAL['sub_tpls'] = array($tpl);
 
$tplfile = S_ROOT.'./'.$tpl.'.htm';
/*
$tplfile looks like template/default/cp_blog.htm or template/blue/cp_blog.htm
*/
$objfile = S_ROOT.'./data/tpl_cache/'.str_replace('/','_',$tpl).'.php';
/*
$objfile looks like data/tpl_cache/template_default_cp_blog.php or data/tpl_cache/template_blue_cp_blog.php
*/
 
//read
if(!file_exists($tplfile)) {
//If a non-default style lacks this template file, fall back to the default style's copy
$tplfile = str_replace('/'.$_SCONFIG['template'].'/', '/default/', $tplfile);
}
//Read the contents of the template file
$template = sreadfile($tplfile);
if(empty($template)) {
exit("Template file : $tplfile Not found or have no access!");
}
 
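//Templates
$template = preg_replace("/\<\!\-\-\{template\s+([a-z0-9_\/]+)\}\-\-\>/ie", "readtemplate('\\1')", $template);
/*
This is where UCH's template syntax is defined. A <!--{template name}--> tag in a template page
is replaced with the result of readtemplate('name'). readtemplate is also defined in function_template.php. Here name is whatever ([a-zA-Z0-9_\/]+) matches.
Why the extra A-Z? Because the i modifier at the end of "/\<\!\-\-\{template\s+([a-z0-9_\/]+)\}\-\-\>/ie" makes the match case-insensitive.
The e modifier makes preg_replace evaluate the replacement string as PHP code, so the tag is replaced with the function's return value.
Python regular expressions refer to the first group as \1. Here it is written \\1 because the backslash has to be escaped inside a PHP double-quoted string.
Why is \\1 wrapped in single quotes? Because the argument of readtemplate has to be a string.
*/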
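//Handle the code in child pages
$template = preg_replace("/\<\!\-\-\{template\s+([a-z0-9_\/]+)\}\-\-\>/ie", "readtemplate('\\1')", $template);
//Parse block calls
$template = preg_replace("/\<\!\-\-\{block\/(.+?)\}\-\-\>/ie", "blocktags('\\1')", $template);
/*
<!--{block/name}--> is replaced with the result of blocktags('name')
name is whatever (.+?) matches: . matches any character except a newline, and + means one or more occurrences
? makes the match lazy. Without it, .+ would run past the first }--> and swallow everything up to the last }--> on the line
*/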
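
The effect of the lazy quantifier is easy to check outside PHP. Below is a small Python sketch that reuses the shape of the block pattern. The sample line is made up for illustration, and the heavy escaping of the PHP original is reduced to what Python needs.

```python
import re

# A made-up line holding two block tags.
line = "<!--{block/a}--> text <!--{block/b}-->"

# Lazy: stop at the first }--> after each opening tag.
lazy = re.findall(r"<!--\{block/(.+?)\}-->", line)

# Greedy: .+ consumes as much as it can and stops at the last }-->.
greedy = re.findall(r"<!--\{block/(.+)\}-->", line)

print(lazy)
print(greedy)
```

The lazy pattern yields one capture per tag, while the greedy pattern merges both tags and the text between them into a single capture.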
//Parse ads
$template = preg_replace("/\<\!\-\-\{ad\/(.+?)\}\-\-\>/ie", "adtags('\\1')", $template);
/*
<!--{ad/name}--> is replaced with the result of adtags('name')
For example, space_doing.htm contains <!--{ad/header}-->, which is replaced with <!--AD_TAG_1-->
*/
 
//Date handling
$template = preg_replace("/\<\!\-\-\{date\((.+?)\)\}\-\-\>/ie", "datetags('\\1')", $template);
/*
<!--{date(name)}--> is replaced with the result of datetags('name')
For example, space_doing.htm contains
<!--{date('m-d H:i',$basevalue[dateline],1)}-->, which is replaced with <!--DATE_TAG_7-->
*/
//Avatar handling
$template = preg_replace("/\<\!\-\-\{avatar\((.+?)\)\}\-\-\>/ie", "avatartags('\\1')", $template);
/*
<!--{avatar(name)}--> is replaced with the result of avatartags('name')
For example, space_doing.htm contains <!--{avatar($_SGLOBAL[supe_uid],small)}-->, which is replaced with <!--AVATAR_TAG_10-->
*/
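//PHP code
$template = preg_replace("/\<\!\-\-\{eval\s+(.+?)\s*\}\-\-\>/ies", "evaltags('\\1')", $template);
/*
<!--{eval php_expression}--> is replaced with the result of evaltags('php_expression')
php_expression is whatever (.+?) matches, and here . matches every character including newlines, because of the s modifier in /ies
For example, space_doing.htm contains <!--{eval echo formhash();}-->, which is replaced with <!--EVAL_TAG_16-->
*/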
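
The role of the s modifier, and the replacement of a tag by a numbered placeholder, can be sketched in Python. The bodies of evaltags and the other tag functions are not shown in this article, so the bookkeeping below (a list of captured snippets and a counter local to one call) is an assumption made for illustration. replace_eval_tags is a hypothetical name, and the real numbering appears to be shared across tag types, which this sketch does not reproduce.

```python
import re

EVAL_PATTERN = r"<!--\{eval\s+(.+?)\s*\}-->"

# re.DOTALL plays the role of PHP's s modifier: "." also matches "\n".
EVAL_RE_S = re.compile(EVAL_PATTERN, re.IGNORECASE | re.DOTALL)
EVAL_RE_NO_S = re.compile(EVAL_PATTERN, re.IGNORECASE)


def replace_eval_tags(template):
    """Swap each eval tag for a numbered placeholder.

    Returns the rewritten template and the captured code snippets,
    so that the snippets could be put back at a later stage.
    """
    stored = []

    def to_placeholder(match):
        stored.append(match.group(1))
        return f"<!--EVAL_TAG_{len(stored)}-->"

    return EVAL_RE_S.sub(to_placeholder, template), stored


# A made-up eval tag whose code spans two lines.
multi = "<!--{eval $a = 1;\n$b = 2;}-->"

print(replace_eval_tags(multi))
print(EVAL_RE_NO_S.search(multi))
```

Without the DOTALL flag the two-line tag is not matched at all, which is the case the s modifier exists to cover.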
    
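//Start processing
//Variables
$var_regexp = "((\\\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*)(\[[a-zA-Z0-9_\-\.\"\'\[\]\$\x7f-\xff]+\])*)";
/*
In the original post the escaping backslashes are shown in red, and a backslash not shown in red is a literal backslash.
At first I could not see why three backslashes are needed at the start. In a PHP double-quoted string, \\ yields one backslash and \$ yields a literal dollar sign, so \\\$ reaches the regex engine as \$, which matches a literal $. I tried replacing it with a bare $, and it no longer matched the variable names in the template file, because an unescaped $ in a regular expression is the end-of-subject anchor.
$var_regexp matches variable names
[a-zA-Z_\x7f-\xff] A variable name starts with a letter, an underscore, or a byte of a Chinese character
\x7f-\xff is puzzling at first: what is it meant to match? Extended ASCII? 0x80-0xff is indeed the extended ASCII range (0x7f itself is the DEL control character), but clearly nobody would use those characters in variable names.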
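My copy of UCH is the UTF-8 edition. UCS-2 code points (the 16-bit Unicode in common use) are encoded in UTF-8 as follows:

UCS-2 code point (hex)    UTF-8 byte sequence (binary)
0000 - 007F               0xxxxxxx
0080 - 07FF               110xxxxx 10xxxxxx
0800 - FFFF               1110xxxx 10xxxxxx 10xxxxxx

The first range, 0000-007F, is compatible with ASCII.
Chinese characters have Unicode code points within 0080-FFFF, so after conversion to UTF-8 each of their bytes has the form
110xxxxx, 10xxxxxx or 1110xxxx. These all fall within binary 01111111-11111111, that is 0x7f-0xff.
More precisely, the part that matches the start of a variable name could be written [a-zA-Z_\xc0-\xef], because the first byte of a Chinese character can only be 110xxxxx or 1110xxxx, that is 0xc0-0xdf or 0xe0-0xef.
[a-zA-Z0-9_\x7f-\xff]* allows digits in addition to the values allowed for the first byte. As in C, a variable name cannot start with a digit.
More precisely, this could be written [a-zA-Z0-9_\x80-\xef]*, because here a byte of a Chinese character may be 110xxxxx, 1110xxxx or 10xxxxxx. Compared with 0xc0-0xef this adds 10xxxxxx, that is 0x80-0xbf, which gives 0x80-0xef in total.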
 
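(\[[a-zA-Z0-9_\-\.\"\'\[\]\$\x7f-\xff]+\])*
A variable may be an array access such as $name1[$name2]. \$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]* matches $name1, but how is the following [$name2] matched? At first I expected \[\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*\], since what sits inside [] should also be a variable name.
But the variable may be a nested array access such as $name1[$name2[$name3[$name4]]], and matching the names one by one with \$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]* is clearly not workable in that case. So the pattern simply puts every character that may appear, including [ and ], into one character class between the brackets, and trusts programmers not to write absurd variable names.
*/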
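
The byte-range argument above can be checked directly. The Python sketch below works on UTF-8 bytes, as PCRE does when the u modifier is absent. The variable name $用户名 is made up for illustration, and VAR_NAME covers only the name part of $var_regexp, not the array suffix.

```python
import re

# Name part of $var_regexp, applied to raw bytes.
VAR_NAME = re.compile(rb"\$[a-zA-Z_\x7f-\xff][a-zA-Z0-9_\x7f-\xff]*")

# A made-up variable name written with Chinese characters.
raw = "$用户名".encode("utf-8")

# Every byte of the Chinese characters lies in 0x80-0xff.
name_bytes = raw[1:]
all_high = all(0x80 <= b <= 0xFF for b in name_bytes)

# Each character takes three bytes; its lead byte is 1110xxxx (0xe0-0xef).
lead_bytes = name_bytes[0::3]
leads_in_range = all(0xE0 <= b <= 0xEF for b in lead_bytes)

print(all_high, leads_in_range)
print(VAR_NAME.fullmatch(raw) is not None)   # a Chinese name is accepted
print(VAR_NAME.match(b"$1abc"))              # a leading digit is rejected
```

Because the pattern sees bytes rather than characters, one byte range is enough to let multi-byte names through without the regex knowing anything about UTF-8.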
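$template = preg_replace("/\<\!\-\-\{(.+?)\}\-\-\>/s", "{\\1}", $template);
/*
Strings of the form <!--{name}--> are replaced with {name}
The placeholders produced earlier by the ad, date and similar steps, such as <!--EVAL_TAG_15-->, contain no braces, so they do not match and are left in place
name is whatever (.+?) matches. Here . matches every character including newlines (because of the /s modifier), and ? makes the match lazy
*/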
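
That the placeholders survive this step can be confirmed with a short Python sketch. The sample text is made up, and unwrap_tags is a hypothetical name.

```python
import re

# <!--{...}--> with "." also matching newlines, as with PHP's s modifier.
GENERIC_TAG = re.compile(r"<!--\{(.+?)\}-->", re.DOTALL)


def unwrap_tags(template):
    """Turn <!--{name}--> into {name}; comments without braces are untouched."""
    return GENERIC_TAG.sub(r"{\1}", template)


sample = "<!--{name}--> <!--EVAL_TAG_15-->"
print(unwrap_tags(sample))
```

The placeholder is an ordinary HTML comment with no opening brace after <!--, so the pattern never starts a match on it.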
$template = preg_replace("/([\n\r]+)\t+/s", "\\1", $template);
/*
Removes the tabs that follow line feeds and carriage returns
The original post showed the effect here as a Beyond Compare diff
*/
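
The effect of this substitution can be shown on a small input. This is a Python sketch with a made-up HTML fragment, and strip_leading_tabs is a hypothetical name.

```python
import re


def strip_leading_tabs(template):
    """Drop the run of tabs that follows one or more line breaks."""
    return re.sub(r"([\n\r]+)\t+", r"\1", template)


before = "<div>\n\t\t<p>hi</p>\r\n\t</div>"
after = strip_leading_tabs(before)
print(repr(after))
```

Only tabs directly after a line break are removed. The line breaks themselves are kept through the \1 back-reference, and tabs elsewhere in a line are left alone.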
$template = preg_replace("/(\$[a-zA-Z0-9_[
