Improving cis-regulatory elements modeling by consensus scaffolded mixture models

JIANG HongShan  ZHAO Ying  CHEN WenGuang  ZHANG XueGong  
Abstract: A position weight matrix (PWM) is widely accepted as a probabilistic representation for modeling protein-DNA binding specificity. Previous studies showed that for factors that bind to divergent binding sites, mixtures of multiple PWMs improve performance. We propose a consensus scaffolded mixture PWM (CSM) model to improve cis-regulatory element modeling by allowing overlapping components represented by a set of PWMs, each of which corresponds to a binding pattern and is scaffolded by a degenerate consensus. In addition, we propose a learning algorithm that involves an initial structure learning stage based on frequent pattern mining and a refining stage based on the expectation maximization (EM) algorithm. We assess the merits of CSM using three independent criteria. In a case study of the transcription factor Leu3, the derived CSM models agree with conventional mixtures but show a better fit according to the Fermi-Dirac distribution. Analysis of the human-mouse conservation of predicted binding sites of 83 JASPAR transcription factors (TFs) shows that the CSM is as good as or better than the simple mixture, the context-specific independent (CSI) mixture, and the single PWM model for 83%, 84%, and 75% of the cases, respectively. Five-fold cross-validation on 46 TRANSFAC datasets shows that the CSM model generalizes better than other mixture models.
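
As a rough illustration of the simple PWM mixture that the abstract uses as a baseline (not the CSM model itself, whose consensus scaffolding and structure learning are not specified here), the Python sketch below scores a site under a weighted mixture of PWMs and computes the per-component posteriors that an EM E-step would use. The toy sites, the pseudocount, the mixing weights, and all function names are assumptions made for illustration; none of them come from the paper.

```python
BASES = "ACGT"

def pwm_from_sites(sites, pseudocount=1.0):
    """Estimate a PWM from aligned, equal-length sites.

    Returns a list with one dict per position mapping base -> probability.
    The pseudocount is an illustrative choice, not a value from the paper.
    """
    width = len(sites[0])
    pwm = []
    for pos in range(width):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[pos]] += 1.0
        total = sum(counts.values())
        pwm.append({b: counts[b] / total for b in BASES})
    return pwm

def site_probability(pwm, site):
    """Probability of a site under one PWM, assuming independent positions."""
    p = 1.0
    for column, base in zip(pwm, site):
        p *= column[base]
    return p

def mixture_probability(pwms, weights, site):
    """Probability of a site under a weighted mixture of PWMs."""
    return sum(w * site_probability(pwm, site) for pwm, w in zip(pwms, weights))

def responsibilities(pwms, weights, site):
    """EM E-step: posterior probability that each component produced the site."""
    joint = [w * site_probability(pwm, site) for pwm, w in zip(pwms, weights)]
    total = sum(joint)
    return [j / total for j in joint]

# Hypothetical toy data: two made-up binding patterns (not real Leu3 sites).
pattern_a = ["CCGGTA", "CCGGTT", "CCGCTA"]
pattern_b = ["AAGGCC", "AAGGCA", "TAGGCC"]
pwms = [pwm_from_sites(pattern_a), pwm_from_sites(pattern_b)]
weights = [0.5, 0.5]  # assumed equal mixing weights

print(mixture_probability(pwms, weights, "CCGGTA"))
print(responsibilities(pwms, weights, "CCGGTA"))
```

In a full EM fit, these responsibilities would be used as fractional counts to re-estimate each PWM and the mixing weights, and the two steps would repeat until the likelihood stops improving.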
