MEME update: 4.8.1
Hi all,
I just updated our MEME server to the latest version (4.8.1).
This new version comes with a new tool available on the web interface: CentriMo. It is dedicated to the analysis of ChIP-Seq data.
Have fun with it!
MEME 4.7.0 update
Hello,

I just updated our MEME server to the latest version: 4.7.0. As usual, the web interface is available at the same address: http://tools.genouest.org/tools/meme/.

The main change in this version is the new DREME web service and web interface. This tool was already used by MEME-ChIP, which was introduced in MEME suite 4.6. It lets you discover motifs in sets of short (~100 bp) sequences, such as those in a ChIP dataset. If you want to learn how to use this tool, read the DREME tutorial, which is now online. Have a look at the release notes to see the whole changelog.

That’s all for today!

Blast+ web interface: source code available

Hi there!
It’s been a while, but today I come with some good news! As you may remember, I have developed a web interface for Blast+. Some of you asked me if I could publish the source code. So today I’m releasing it to the world! Now you can download an almost ready-to-use web interface and install it on your web server to make it available to your users. I tried to make it as flexible as possible, so you can adapt it to your needs.

Requirements

To install this web interface, you’ll need a web server (Apache for example) with PHP >= 5.3.2 and a SQL database. For better performance, you also need a cluster with the SGE job scheduler. Computing nodes should run Linux.

The code is based on the new Symfony 2 framework, and it is available as “bundles” (which are a sort of Symfony plugin). The installation requires some PHP skills, but it shouldn’t be too hard if you follow the instructions below.

Installation

Getting the code

The first step is to get the Symfony 2 code. Go to the official download page and get the latest “Symfony Standard (.tgz)”. Extract it somewhere on your server and open a terminal in the symfony directory.

Now you need to install some bundles. All our code is available on our github account. First install the GenouestBioinfoBundle following the instructions in the installation section of the corresponding documentation. You also need to install and configure GenouestSchedulerBundle (doc) and GenouestBlastBundle (doc). Follow the installation and configuration sections in the corresponding documentation.

Additionally, if you have a Biomaj server and you want to use it within the blast interface, install GenouestBiomajBundle (doc). If you have no idea what a Biomaj server is, just skip this optional step!

Preparing the database

Before testing your installation, you need to prepare the database that will store information about each blast job launched from the web interface. First, create an empty database on your SQL server.
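As an illustration of this first step, here is a minimal sketch that assumes a MySQL server (the post only asks for "a SQL database"). The database name `blast_jobs`, the `root` account and the `RUN_SQL` switch are hypothetical choices made for this example, not something required by the bundles.

```shell
#!/bin/sh
# Assumption: the SQL server is MySQL. "blast_jobs" is a hypothetical name.
db_name="blast_jobs"
sql="CREATE DATABASE IF NOT EXISTS ${db_name} CHARACTER SET utf8;"

# Always show the statement, so it can be reviewed or pasted elsewhere.
echo "$sql"

# Only contact the server when explicitly asked to (RUN_SQL=1)
# and when the mysql client is installed.
if [ "${RUN_SQL:-0}" = "1" ] && command -v mysql >/dev/null 2>&1; then
    mysql -u root -p -e "$sql"
fi
```

Run as is, the script only prints the statement. Whatever name you pick here is the one to report later in the connection parameters.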
Then configure the connection of your blast interface. Briefly, open the ‘app/config/parameters.ini’ file and fill in the different connection parameters (database driver, hostname, user, password, database name).

Now we need to create the SQL tables in the database. To do so, just launch the following command from the symfony root directory:

    php app/console doctrine:schema:update --force

And that’s it! The database is ready to be used.

Getting the web interface online

To test your application, just make sure that the ‘web’ directory is accessible from the internet. You can have a look at the Symfony documentation for more help. Personally, I prefer to install symfony in any directory, and then create a symbolic link in the apache www directory pointing to the Symfony “web/” dir.

Suppose your Symfony is installed in /opt/myblastapp/ and the apache root directory is /apache/www/. You can create a symbolic link like this:

    ln -s /opt/myblastapp/web /apache/www/blast

And now access your application using http://example.org/blast/

Getting help & contributing

As you can see, installing the web interface on your server takes a few steps, but it shouldn’t be too hard if you have some PHP skills. As it is based on the Symfony 2 framework, it is really customizable. In case of problems, the Symfony documentation can help you: it is well written and covers most of the things you can do with this framework.

The code is released under the French CeCILL license, which is a GPL-like license. Don’t hesitate to submit bug reports or patches to our github repositories. Any comments are welcome!

Blast+ and MEME updates
Hi all,

Just a quick post to tell you that I have updated our servers to the latest Blast+ and MEME versions.

Blast+ 2.2.25+

The NCBI released Blast+ 2.2.25+ a few days ago. I was particularly impatient to get it, as there was a bug in version 2.2.24+ which caused some results to be incomplete (you may have seen the warning message about that on our form). This new version fixes this specific bug (and others), and brings some improvements which you can see in the changelog. Feel free to test our blast form and tell us if you have any problem with it.

Speaking of this form, to answer a comment on one of my previous posts: we’re not really planning to release the code right now. It is based on the symfony framework, using one of our plugins to submit jobs to our cluster (sfobManagerPlugin). We will probably port this code to the new Symfony2 architecture soon, so maybe one day we will release a BlastBundle for it?

MEME 4.6.1

MEME has also been updated to the brand new 4.6.1 version. As usual, reading the release notes will tell you what’s new. The changes mostly concern MEME-ChIP, which is now available from the command line. This may be useful if you want to test it on our cluster: see this post if you don’t know how to use it from the command line. Test it, and tell us if you have any problem with it!

Bye

MEME 4.6: MEME-ChIP and Spamo
As promised in one of my last posts, I’ve just finished updating our MEME server to the fresh version 4.6.0. The main new feature of this release is the addition of two new applications to the suite: MEME-ChIP and Spamo.

MEME-ChIP

As you may guess, MEME-ChIP is dedicated to… ChIP-Seq experiments!
This tool is in fact a meta-tool that launches several analyses on a set of sequences. The good news is that the official website has a great tutorial explaining how it works.

Briefly, the input data is a fasta file containing many sequences generated by ChIP-seq (or another technology producing the same kind of sequences). The first step is to find motifs in these sequences, and two tools are launched in parallel for this: MEME and DREME. MEME is better at finding wide motifs, while DREME is designed for shorter ones.

Once it has found a lot of motifs (hopefully), the next step is to compare them to public databanks of motifs, like Jaspar for example. This is done using TOMTOM. MEME-ChIP then launches a MAST search to find each motif site in the sequences you submitted. Finally, AMA and AME are used to estimate the binding affinity of the input sequences to each motif, and to find subtly enriched known binding motifs in your input sequences.

So MEME-ChIP is a new tool specialized in ChIP-seq data analysis.

Spamo
You give it a set of sequences (typically ChIP-seq sequences) and a motif that is represented in some of these sequences. The third thing to specify is a databank of motifs like Jaspar or Uniprobe. Spamo searches for all motifs of the given databank near the motif sites in your sequences. So this tool can help you determine the presence of a known motif at a specific position near another one, which is useful when studying transcription factor binding sites.

It seems Spamo is still in a beta version, but I didn’t hit any bug with it.

That’s it for the MEME 4.6 release!

New BLAST+ web form
Hi all,

Today, let’s talk about the famous BLAST and its successor BLAST+! BLAST+ is available on the platform.

At the beginning…

…there was BLAST. It was published in 1990 and it is one of the most used bioinformatics tools. Many web forms have been created around the world, the main ones being at the NCBI and at the EBI. In short, if you don’t already know what BLAST is: it compares sequences and lets you find sequences that are similar to a given one.

The main BLAST implementation comes from the NCBI, although other implementations were also released (WU-BLAST which was later renamed AB-BLAST and is not free, FSA-BLAST, …). BLAST comes in many flavours (blastn, blastp, blastx, tblastn, tblastx, psiblast, phiblast, megablast, …) which mainly differ in the type of sequences that are compared.

BLAST+

At the end of 2009, the NCBI published a complete rewrite of their BLAST: it is now called BLAST+. Their aim was to provide an implementation that is faster and easier to use, while giving comparable results. If you’re not using the command line, the main change you can see is the NCBI web interface, which was completely revamped a few months ago.

At the command line level, there were many changes. In fact they renamed all the binaries and options. The following picture perfectly illustrates this:

[Picture: old BLAST binaries and options renamed in BLAST+]

I think they decided that it was time to change all the names once and for all. And I think they’re right: it’s much more usable now. For compatibility, there is a perl script that you can use to translate an old command line into the BLAST+ format. If you look at the publication, you’ll see the performance improvements.

BLAST+ at GenOuest

On the platform, we have installed BLAST+ (but the legacy BLAST is still available of course). To use it from the command line, just source it like this: “source /local/env/envblast+”, and then you can play with blastn, blastp and all their friends.

If you prefer a web interface, we have created a new one using BLAST+.
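To make the renaming of binaries and options more concrete, here is a toy sketch of what such a translation looks like. It is not the NCBI perl script mentioned above: the function name `blastall_to_plus` is made up for this example, and it only knows five legacy `blastall` options (`-p`, `-i`, `-d`, `-o`, `-e`) and their BLAST+ counterparts.

```shell
#!/bin/sh
# Toy translator: legacy "blastall" options -> BLAST+ command line.
# Only -p, -i, -d, -o and -e are handled; anything else is rejected.
blastall_to_plus() {
    prog=""
    args=""
    OPTIND=1
    while getopts "p:i:d:o:e:" opt; do
        case "$opt" in
            p) prog="$OPTARG" ;;                  # program becomes the binary name
            i) args="$args -query $OPTARG" ;;     # input sequences
            d) args="$args -db $OPTARG" ;;        # databank
            o) args="$args -out $OPTARG" ;;       # output file
            e) args="$args -evalue $OPTARG" ;;    # e-value threshold
            *) echo "unsupported option" >&2; return 1 ;;
        esac
    done
    [ -n "$prog" ] || { echo "missing -p <program>" >&2; return 1; }
    echo "$prog$args"
}

# Hypothetical file and databank names.
blastall_to_plus -p blastn -i query.fasta -d nt -e 1e-5 -o hits.txt
```

The last line prints `blastn -query query.fasta -db nt -evalue 1e-5 -out hits.txt`. Keep in mind that renaming the options is not the whole story: some default settings also differ between the two generations (the default blastn task, for instance), so compare results before switching an existing pipeline.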
The new web form is largely inspired by the NCBI form, the main differences being the available databanks and the dedicated resources. We hope you will like it. It is already much better than our previous form, and we should improve it with a surprise in a few weeks.

Tell us if you have any problem with this new form!

MEME overview
It’s been too long since my last post, but finally, I’m back! I will soon present the work I have done under the hood. But for now, let’s talk about Meme!

Meme is a suite of tools for pattern matching, pattern discovery, and other pattern manipulations. It works with PSSM, and it is particularly designed for nucleic sequences, although some of the tools work with protein sequences too.

Meme is installed on our platform, and it is available with a great web interface. It is also possible to use it from the command line (source /local/env/envmeme), and with webservices (as usual with our Opal server).

I am going to present the most useful tools of this suite. You should also visit the MEME documentation, which is quite helpful.

MEME and MAST

MEME

The MEME tool (which gave its name to the whole suite) is dedicated to pattern discovery. It simply takes a set of sequences (protein or nucleic acid) and searches for patterns represented in some or all of the sequences. You can specify the number of motifs to find, the minimum and maximum length of the motif(s), and whether the motif(s) must be present in all or only some of your sequences. To use it, just go to this page!

The results are in HTML format. They contain a list of the motifs found by the program. Each one is represented with a nice logo and is given a score (a low score means a high-quality motif):

[Figure: An example motif found by MEME. The left logo is on the forward strand, the right one is the same motif on the reverse strand.]

Just after these logos, there are 4 buttons which let you launch further analyses using the MEME results: MAST, FIMO, GOMO and BLOCKS are available. After this, you can see the motif sites found in the sequences you gave to MEME.

Of course you can download and view your motif in different formats: PSSM or PROSITE-like pattern, the latter being less expressive than the former.
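For those who prefer the command line, the MEME settings described above (number of motifs, minimum and maximum width, motif present in all or some sequences) map onto a handful of `meme` options. The sketch below is only an illustration: the file name `promoters.fasta`, the output directory `meme_out` and the chosen values are assumptions, and it expects the environment to be loaded first (source /local/env/envmeme on our cluster).

```shell
#!/bin/sh
# Hypothetical input file and output directory.
seqs="promoters.fasta"
outdir="meme_out"

# -dna      nucleic acid sequences
# -mod      zoops = zero or one site per sequence (motif in some sequences),
#           oops  = exactly one site per sequence (motif in all sequences)
# -nmotifs  number of motifs to look for
# -minw / -maxw  minimum and maximum motif width
# -oc       output directory, overwritten if it already exists
meme_cmd="meme $seqs -dna -mod zoops -nmotifs 3 -minw 6 -maxw 15 -oc $outdir"
echo "$meme_cmd"

# Run only when MEME is installed and the input file is there.
if command -v meme >/dev/null 2>&1 && [ -f "$seqs" ]; then
    $meme_cmd
fi
```

Without MEME or without the input file, the script just prints the command line, which is handy for checking the options before submitting a job.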
MAST

When you have found a motif in a set of sequences, the next thing you might want to do is to search for other sequences containing this motif, i.e. pattern matching. The MEME suite comes with a tool dedicated to this: MAST. You can either launch a MAST search directly from a MEME result page, or save a MEME motif and then upload it on the MAST form.

On the MAST form, you only have to choose a motif and a sequence database to look into (some bacterial genome for example). There is also an option very similar to the blast e-value threshold parameter: MAST gives a score to each hit it finds in the database, and it only shows you hits having an e-value lower than a given threshold (10 by default).

After launching the search, you get a representation of the searched sequence(s), with each hit highlighted. If you hover your mouse over a hit, MAST gives you the associated e-value and the position in the sequence.

GLAM2 and GLAM2SCAN

GLAM2 does the same job as MEME (pattern discovery) except that it can discover motifs containing gaps. And GLAM2SCAN is the equivalent of MAST, with gap support. There is not much more to say about them: they work very similarly to MEME and MAST. GLAM2 has a few more options, in particular for tuning insertion and deletion costs.

TOMTOM

The last main tool offered in the MEME suite is TOMTOM. It aims to compare a given motif to databanks of publicly available motifs. Let’s suppose you have found a new motif with MEME or GLAM2: you might want to know if this motif has already been described by someone else. TOMTOM will give you the answer.

On the web form, you have to enter the motif you have found, and then select a databank in which to look for similar motifs. There is also an e-value threshold working the same way as in MAST, GLAM2SCAN or blast.

You can search in various databanks, especially JASPAR or TRANSFAC, which are big databanks of transcription factor binding sites.
There are other, usually smaller, databanks specific to some organisms (Drosophila for example) or domains. As I said at the beginning, the MEME suite is mainly aimed at nucleic sequences, so the databanks available in TOMTOM are nucleic ones, and more specifically banks of transcription factor binding sites.

Other tools

So we have seen MEME, MAST, GLAM2, GLAM2SCAN and TOMTOM in action. But MEME comes with some other tools:

MCAST searches for clusters of motif sites in a given sequence. This can be helpful when you’re studying regulatory modules.

GOMO is a pattern matching program which can help you assign a function to a motif: you give it a motif, and it searches for genes located close to occurrences of this motif. After this, it automatically retrieves the GO terms associated with these genes. So you get a list of GO annotations related to the motif you entered (if you don’t know GO terms, have a look at this page!).

FIMO is very similar to MAST: when you give 3 motifs to MAST, it will search for sequences having at least these 3 motifs. With FIMO, you will get sequences containing at least one of the motifs you gave.

There are other smaller utilities only available from the command line. Take a look at the documentation to see if they can help you.

MEME 4.6

While writing this article, I found out that MEME 4.6 has been released. It comes with two new tools (MEME-ChIP and Spamo) which I haven’t tested yet. I’m going to update our MEME server in the next few days and I will probably write a new blog post about these new tools. Stay tuned!

New website!
Hi all!

Great news

Today we’re launching a new website! It’s available at www.drmotifs.org and it’s the new face of Dr Motifs.

The aim of this new website is to have a new entry point to Dr Motifs, with more stable information and (we hope!) a friendlier interface. The homepage is focused on the two main analyses you want to do with motifs: discovery and matching. Each one has a dedicated page (which is still incomplete, but will get better in the near future).

One page I’d like to point you to is the tools page. It contains a full list of all the motif analysis tools installed on our platform, with a summary of their specificities (motif type, nucleic or proteic, etc.), and of course a link to each web interface and to related blog posts.

I hope you will like this new website. Don’t hesitate to tell me if you have problems with it!

The blog

Of course, this blog will continue to live! In fact the website and the blog are complementary: stable, summarized information on the website, and fresh, dynamic content on the blog.

New versions: MEME and InterProScan
Hello all,

Just a few words to keep you informed about the progress of the Dr Motifs project.

InterProScan

InterProScan 4.7 was released a few days ago. We have installed this new version on our cluster. The main new feature is that it now uses hmmer 3 for the Gene3D and Pfam databanks. This hopefully means faster and better results! The next big release planned is 5.0 (no release date known yet). According to the authors, it will be a complete rewrite of the tool.

Our web form is still available at iprscan.genouest.org and as a webservice on our Opal server. The EBI has also created a new web form. The results visualization seems more convenient, but I experienced a few instabilities while testing… I will test it more extensively when it gets more stable.

MEME suite

We have updated MEME to the brand new 4.5 version. The main change concerns the TOMTOM application, which compares a motif against a database of known motifs from databanks like Jaspar. The submission form is now more usable and a proper webservice is available.

We are also proud to now appear in the list of alternate servers for MEME!

The web interface for the MEME suite is available here. More details on how to use this software suite will come in one of my next articles.

What’s next

In the next weeks, I will be:
Just a word about the tools I am going to install: the most useful publicly available tools are now deployed, so I am going to focus on some tools developed by the Symbiose INRIA team we are working with.

Sequence hammering
Hi all,

As promised in my last post, I’m going to show you what the Hmmer tools are and how you can use them for your sequence analysis. First, you have to know that Hmmer is a collection of tools dedicated to the manipulation of Hidden Markov Models. So it is particularly useful for studying complex motifs with subtle signals. And it is designed to work with protein sequences.

Hmmer 3 is a suite of 12 tools (hmmalign, hmmbuild, hmmconvert, hmmemit, hmmfetch, hmmpress, hmmscan, hmmsearch, hmmsim, hmmstat, jackhmmer and phmmer). But don’t be afraid: with only 5 of them, you can already do great things! And there is good documentation for each tool on the official website.

The most common things you can do are summarized in the following figure:

[Figure: overview of the most common Hmmer analyses]

So let’s have a look at the most important tools.

From sequences to HMM: hmmbuild

Let’s suppose you have a set of sequences and you think that they contain a common motif. With hmmer, you can build an HMM representing this motif. To do this, there are basically two steps: first build a multiple alignment of your sequences, then give this alignment to hmmbuild.
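The two steps can be sketched from the command line as follows. This is only an illustration under a few assumptions: the alignment step is replaced by a tiny hand-written Stockholm file, the sequences and file names are invented, and the hmmer environment must already be loaded for the last part to do anything.

```shell
#!/bin/sh
# Work in a scratch directory so nothing existing gets overwritten.
workdir="$(mktemp -d)"
aln="$workdir/motif.sto"
hmm="$workdir/motif.hmm"

# Step 1 (stand-in): a toy multiple alignment in Stockholm format.
# In real life this file comes from your alignment program.
cat > "$aln" <<'EOF'
# STOCKHOLM 1.0
seq1  MKVLAAGIVG
seq2  MKVLSAGIVG
seq3  MKILAAGLVG
seq4  MKVLAAG-VG
//
EOF

# Step 2: build the HMM from the alignment (output file first, alignment second).
if command -v hmmbuild >/dev/null 2>&1; then
    hmmbuild "$hmm" "$aln"
else
    echo "hmmbuild not found; alignment written to $aln"
fi
```

The resulting motif.hmm is a plain text file that can then be fed to the search tools of the suite.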
Several alignment formats can be read by hmmbuild: aln (clustalw output format) or Stockholm, for example. The output of hmmbuild is a text file representing the HMM that was found.

How to use it?

Hmmbuild is available on our platform as a SOAP webservice on our Opal server.

Alternatively, the whole hmmer suite is available from the command line on genocluster2: just issue the command “source /local/env/envhmmer-3.0” (or “. /local/env/envhmmer-3.0.sh” if you use bash) and you’ll get access to the 12 hmmer tools.

Blastp-like search: phmmer and jackhmmer

Hmmer comes with two other useful tools: phmmer and jackhmmer. They do the same kind of analysis as blastp and psi-blast respectively.

Phmmer takes a protein sequence as input and searches for similar sequences in a protein databank (NR for example).

Jackhmmer does the same work as phmmer, except that it repeats it iteratively. This means that, as psi-blast does, it launches a phmmer-like search a first time, then looks at the results, selects the best matches to the query sequence, builds a new HMM from them and searches the databank again for new similar sequences. You can set the maximum number of iterations (and if you set it to 1, it behaves exactly like a normal phmmer search).

You may wonder why you should use phmmer/jackhmmer (or not) instead of the traditional blastp/psi-blast. Performance is the answer: it seems that phmmer runs a bit faster than the good old blastp (well, I’ve only done a quick test, so check it with your data). But keep in mind that it only works for protein sequences. And you’re not guaranteed to get the same results as with blastp (scores are not identical). So try it and see if you’re happy with it!

How to use it?

Both tools are available on our platform: Phmmer on Mobyle and Jackhmmer on Mobyle; Phmmer SOAP webservice and Jackhmmer SOAP webservice.

Alternatively, both are part of the hmmer suite available from the command line on genocluster2, as described above.

Searching some known HMM in a new sequence: hmmscan

Do you remember how InterProScan works? Hmmscan is quite similar: it takes a fasta sequence as input, and it searches it for occurrences of the HMMs registered in a specific databank.

The main HMM databank used by hmmscan is Pfam. It is a databank of HMMs representing protein families. In fact there are two sections in Pfam: Pfam-A is a manually curated collection of protein families, while Pfam-B is of somewhat lower quality as it contains automatically generated families. So the usual process is to search first using Pfam-A, and if you don’t get results, to search using Pfam-B.

InterProScan does the same kind of thing, but it uses several tools to search within several databanks. In fact, when you use InterProScan, you already use hmmer: in the list of programs used by InterProScan, there is hmmpfam, which is the ancestor of hmmscan (in hmmer 2).

How to use it?

Hmmscan is available on our platform.

Alternatively, it is part of the hmmer suite available from the command line on genocluster2, as described above.

Searching for sequences using a HMM: hmmsearch

Hmmsearch does the opposite of hmmscan: you start from a HMM and then you search sequence databanks for sequences containing the HMM you’re interested in.

Using it is quite simple: just give a HMM and select a sequence databank (a protein one) to search in. Instead of the sequence databank, you can also give a fasta file containing some specific sequences. The result is a list of matches with the corresponding scores.

How to use it?

Hmmsearch is available on our platform.

Alternatively, it is part of the hmmer suite available from the command line on genocluster2, as described above.

HMM retrieving: hmmfetch

Hmmfetch is another tool, which lets you retrieve a HMM from a databank (Pfam for example). You just have to give a list of HMM identifiers and hmmfetch will give you the whole HMM file. Identifiers can be for example PF00045 or Caudal_act.

How to use it?

Hmmfetch is available on our platform.

Alternatively, it is part of the hmmer suite available from the command line on genocluster2, as described above.

Other tools

Other hmmer tools are also available on the platform (web interfaces on Mobyle and SOAP webservices on Opal), though they are not very useful for most analyses.

That’s it for today!
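PS: for those going the command-line route, here is a small recap of how the tools presented in this post are typically invoked. Treat it as a sketch: all file names (query.fasta, proteins.fasta, motif.hmm, Pfam-A.hmm) are placeholders, and the `run` helper with its `DRY_RUN` switch is something made up for this example so that the script only prints the commands unless you explicitly ask it to execute them.

```shell
#!/bin/sh
# Print a command; execute it only when DRY_RUN=0 and the program exists.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ] && command -v "$1" >/dev/null 2>&1; then
        "$@"
    fi
}

# Blastp-like search of one protein against a sequence databank.
run phmmer query.fasta proteins.fasta

# Same search, iterated (here at most 3 rounds), psi-blast style.
run jackhmmer -N 3 query.fasta proteins.fasta

# Search a sequence against a databank of HMMs (the databank is pressed first).
run hmmpress Pfam-A.hmm
run hmmscan Pfam-A.hmm query.fasta

# Search a sequence databank with one HMM.
run hmmsearch motif.hmm proteins.fasta

# Retrieve one HMM from a databank by name.
run hmmfetch Pfam-A.hmm Caudal_act
```

Called as is, the script lists the six command lines; setting DRY_RUN=0 after loading the hmmer environment runs them for real, provided the input files exist.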




